
@wilmoore

wilmoore/prd.md Secret

Last active January 17, 2026 04:20
Business :: Ideas :: StoryBeats

STORYBEATS
Product Requirements Document
Version: Final Canonical
Date: 2026-01-04


  0. INTERNAL MANIFESTO: Why StoryBeats Exists

AI video generation is not hard. Learning what to generate is hard.

Most AI video tools charge users full price while they are still confused. That is backwards.

We refuse to:

  • make failure expensive
  • punish exploration
  • watermark previews
  • hide learning behind credits
  • require external editors just to finish work
  • force users to think like pipeline engineers

If a user runs out of credits before they know what they want, the product has failed.

Principles:

  1. Iteration must be cheaper than intention. Exploration is expected. Exploration should feel safe.

  2. Assembly is the real work. Short clips are atoms, not products. The product is the stitched story.

  3. The system must explain itself. When something fails, we name the cause and propose fixes.

  4. Voice expresses intent; structure ensures sanity. Users talk. The system constrains. The system never becomes a chatbot.

  5. Timeline is an implementation detail. Storyboards are the interface. Timelines stay hidden.

Hard lines:

  • No watermarks on previews. Ever.
  • No raw prompt boxes.
  • No silent global changes.
  • No “try again” without a causal explanation.
  • No configuration hell.

A. NAMING RATIONALE AND CANONICAL TERMINOLOGY

A.1 Product Name Rationale

The product is named StoryBeats because the core unit of interaction is a beat.

A beat represents a single, intentional narrative moment. Storyboarding, teaching, filmmaking, comedy, and animation all rely on beats to structure meaning, pacing, and progression.

The name intentionally:

  • anchors the product in storytelling rather than generation
  • reflects the system’s beat-first architecture
  • resonates with users who already think in narrative units
  • remains elastic enough to support future sound and music layers without renaming

Some users may associate “beats” with music. This is considered acceptable and directionally correct. Music is treated as a future, beat-aware layer that follows visual structure rather than preceding it.

StoryBeats prioritizes: story, visuals, sound. In that order.

A.2 Canonical Definition of a Beat

In StoryBeats, a beat is defined as:

A single narrative moment that communicates one idea, action, or transition in a story.

A beat is not:

  • a prompt
  • a clip
  • a shot specification
  • a musical bar
  • a timeline segment

A beat is:

  • the smallest unit of meaning
  • independently previewable
  • independently regenerable
  • composable into a finished story

This definition is canonical across product, UX, and engineering.


  1. PRODUCT SUMMARY

StoryBeats is a voice-first, storyboard-first AI video creation system designed to make learning, iteration, and assembly cheap, fast, and inevitable.

Unlike existing AI video tools that optimize for single-shot generation, StoryBeats optimizes for human learning loops:

  • fast previews
  • constrained iteration
  • enforced consistency
  • automatic assembly
  • guided correction
  • predictable cost

StoryBeats treats AI video not as a prompt gamble, but as a structured creative process that users can understand, control, and finish.


  2. PROBLEM STATEMENT

Current AI video tools:

  • make failure expensive in time and credits
  • punish exploration
  • require users to understand hidden constraints
  • expose fragile pipelines
  • force external editing and stitching
  • provide no explanation when outputs fail

Users routinely burn credits before understanding what they want, wait minutes for unusable results, and abandon tools out of frustration.

This is not a model problem. It is a product design failure.


  3. GOALS, NON-GOALS, AND ANTI-FEATURES

3.1 Goals

  • Voice driven creation of multi beat storyboards
  • Fast beat previews to enable rapid iteration
  • Consistency enforcement for environments and characters
  • Guided correction with causal explanations
  • One-click assembly with sane defaults
  • Model-agnostic execution via profiles

3.2 Non-Goals for V1

  • Full timeline editors
  • Fine-grained motion curves
  • Audio and music layers
  • Collaboration and sharing
  • Custom model parameter tuning
  • Branching narratives

3.3 Explicit Anti-Features

The following are intentionally excluded:

  • free-form timelines
  • node graphs or pipeline editors
  • raw prompt text areas
  • blind “variations” buttons
  • watermarked previews
  • credit-based gating during learning
  • silent global state changes
  • exporting just to see flow

These are not missing features. They are explicitly rejected.


  4. USER TRUST CONTRACT

StoryBeats guarantees:

  • previews are never watermarked
  • preview generation never consumes final render quota
  • regenerating one beat never mutates other beats
  • global changes are never applied without confirmation
  • expensive operations always require explicit approval
  • every failure includes a cause and suggested fixes

Violating these guarantees is a product failure, not a UX bug.


  5. TARGET USERS

Primary:

  • beginners and intermediates exploring AI video
  • educators, founders, creators, students
  • users who do not know prompt engineering
  • users who want to learn by doing

Secondary:

  • power users exhausted by ComfyUI
  • users who value speed and clarity over knobs

  6. CORE OBJECTS AND DATA MODEL

Project:

  • id
  • title
  • aspect_ratio
  • visual_style_preset
  • environment_id
  • character_ids
  • ordered list of beats
  • execution_profile_defaults

Environment:

  • id
  • name
  • structured_description
  • reference_images
  • locked flag

Character:

  • id
  • name
  • role
  • structured_description
  • reference_images
  • locked flag

Beat:

  • id
  • order_index
  • intent
  • environment_ref
  • character_refs
  • state: Draft, Previewed, Validated
  • assets: preview, final
  • derived: semantic_caption, motion_plan, guidance_events

GuidanceEvent:

  • id
  • beat_id
  • type
  • cause
  • suggestions
  • actions
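The objects above can be sketched as plain data structures. The following Python sketch mirrors the Beat and GuidanceEvent fields listed here; the class shapes, default values, and the flattening of `assets` into two optional strings are illustrative assumptions, not spec.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class BeatState(Enum):
    DRAFT = "Draft"
    PREVIEWED = "Previewed"
    VALIDATED = "Validated"

@dataclass
class Beat:
    id: str
    order_index: int
    intent: str
    environment_ref: str
    character_refs: list[str] = field(default_factory=list)
    state: BeatState = BeatState.DRAFT
    preview_asset: Optional[str] = None   # assets: preview
    final_asset: Optional[str] = None     # assets: final

@dataclass
class GuidanceEvent:
    id: str
    beat_id: str
    type: str              # one of the failure taxonomy categories
    cause: str             # the causal explanation shown to the user
    suggestions: list[str] = field(default_factory=list)
    actions: list[str] = field(default_factory=list)

# A new beat always starts in Draft with no assets attached.
beat = Beat(id="b1", order_index=0, intent="Hero enters the workshop",
            environment_ref="env-workshop")
```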

  7. STATE MACHINE

States:

  1. No Project
  2. Project Setup
  3. Beat Authoring
  4. Preview Iteration
  5. Beat Validation
  6. Assembly Preview
  7. Final Render
  8. Export Complete

Rules:

  • actions are gated by state
  • invalid actions trigger guidance
  • validated beats must be revalidated after structural changes
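The gating rule above can be made concrete with a small lookup: each state lists its valid actions, and anything else returns guidance rather than a bare error. The state names follow the list above; the action names and dictionary shape are illustrative assumptions.

```python
# Each state permits only a fixed set of actions (action names assumed).
ALLOWED_ACTIONS = {
    "No Project": {"create_project"},
    "Project Setup": {"define_environment", "define_characters", "start_authoring"},
    "Beat Authoring": {"add_beat", "describe_intent", "generate_preview"},
    "Preview Iteration": {"regenerate_beat", "validate_beat"},
    "Beat Validation": {"assemble_preview"},
    "Assembly Preview": {"confirm_final_render"},
    "Final Render": {"export"},
    "Export Complete": set(),
}

def attempt(state: str, action: str):
    """Return ("ok", action) if allowed; otherwise guidance with valid actions."""
    allowed = ALLOWED_ACTIONS[state]
    if action in allowed:
        return ("ok", action)
    # Invalid action: no generic error, just the valid next steps.
    return ("guidance", sorted(allowed))
```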

  8. EXECUTION PROFILES

Users choose outcomes, not models.

Fast Preview:

  • purpose: iteration
  • latency target: 2 to 3 seconds
  • watermark: forbidden
  • confirmation: no

High Quality Images:

  • purpose: final stills
  • confirmation: yes

Image to Video Per Beat:

  • requires beat validation
  • confirmation: yes

Final Assembly Render:

  • requires all beats validated
  • confirmation: yes
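The four profiles reduce to a few boolean gates. A minimal sketch, assuming a frozen dataclass per profile (field and key names are illustrative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExecutionProfile:
    name: str
    purpose: str
    requires_confirmation: bool
    requires_validated_beats: bool = False
    watermark_allowed: bool = False   # hard line: never true for previews

PROFILES = {
    "fast_preview": ExecutionProfile("Fast Preview", "iteration", False),
    "hq_images": ExecutionProfile("High Quality Images", "final stills", True),
    "image_to_video": ExecutionProfile("Image to Video Per Beat", "per-beat video",
                                       True, requires_validated_beats=True),
    "final_assembly": ExecutionProfile("Final Assembly Render", "full render",
                                       True, requires_validated_beats=True),
}

def may_run(profile_key: str, confirmed: bool, all_beats_validated: bool) -> bool:
    """Check the two gates: explicit confirmation and beat validation."""
    p = PROFILES[profile_key]
    if p.requires_confirmation and not confirmed:
        return False
    if p.requires_validated_beats and not all_beats_validated:
        return False
    return True
```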

  9. PERFORMANCE AND LATENCY BUDGETS

  • Beat preview generation: ≤ 3 seconds
  • Beat regeneration: ≤ 3 seconds
  • Assembly preview start: ≤ 2 seconds
  • Voice recognition latency: ≤ 500 ms perceived
  • UI updates: immediate

Any feature that violates these budgets must be redesigned or cut.
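These budgets are concrete enough to encode as data and check in CI or telemetry. A minimal sketch (the operation keys are illustrative names, not a logging spec):

```python
# Latency budgets from this section, in milliseconds.
BUDGETS_MS = {
    "beat_preview": 3000,
    "beat_regeneration": 3000,
    "assembly_preview_start": 2000,
    "voice_recognition": 500,
}

def within_budget(operation: str, elapsed_ms: float) -> bool:
    """True if a measured operation met its latency budget."""
    return elapsed_ms <= BUDGETS_MS[operation]
```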


  10. GUIDANCE ENGINE AND FAILURE TAXONOMY

Failure categories:

  • IntentMismatch
  • CharacterDrift
  • EnvironmentDrift
  • MotionOverpower
  • CompositionIssue
  • ExecutionFailure

Rules:

  • no generic errors
  • no “try again” messaging
  • every failure includes cause and fixes
  • fixes are one click or one voice command
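The "no generic errors" rule implies every failure category carries a named cause and concrete fixes. A sketch of that catalog, shown for three of the six categories (the cause and fix strings are illustrative, not product copy):

```python
# Each taxonomy category maps to a causal explanation plus one-click fixes.
FAILURE_CATALOG = {
    "IntentMismatch": {
        "cause": "The generated clip does not depict the beat's stated intent.",
        "fixes": ["Rephrase the beat intent", "Simplify to a single action"],
    },
    "CharacterDrift": {
        "cause": "A character's appearance diverged from its locked reference.",
        "fixes": ["Re-apply character reference images", "Reduce motion strength"],
    },
    "MotionOverpower": {
        "cause": "Motion strength overwhelmed composition and subject identity.",
        "fixes": ["Lower motion intensity", "Anchor the camera"],
    },
}

def build_guidance(beat_id: str, category: str) -> dict:
    """Build a GuidanceEvent payload: always a cause, always suggested fixes."""
    entry = FAILURE_CATALOG[category]
    return {
        "beat_id": beat_id,
        "type": category,
        "cause": entry["cause"],
        "suggestions": entry["fixes"],
    }
```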

  11. VOICE INTERFACE

Voice is the primary control plane.

Voice supports:

  • add beat
  • describe beat intent
  • generate preview
  • regenerate beat
  • validate beat
  • assemble preview
  • confirm final render

Voice is state-aware and constrained to valid next actions.
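One way to realize state-aware constraint is a registry where each voice command declares the states in which it is valid; the recognizer then only offers commands legal right now. A sketch (state names follow the state machine section; the registry shape is an assumption):

```python
# Each voice command lists the states in which it is a valid next action.
VOICE_COMMANDS = {
    "add beat": {"Beat Authoring"},
    "generate preview": {"Beat Authoring"},
    "regenerate beat": {"Preview Iteration"},
    "validate beat": {"Preview Iteration"},
    "assemble preview": {"Beat Validation"},
    "confirm final render": {"Assembly Preview"},
}

def available_commands(state: str) -> list[str]:
    """Commands the voice layer should accept in the current state."""
    return sorted(cmd for cmd, states in VOICE_COMMANDS.items() if state in states)
```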


  12. SCREEN-LEVEL UX OVERVIEW

Core screens:

  • Home and Projects
  • Project Setup
  • Storyboard Workspace
  • Beat Focus Mode
  • Assembly Preview
  • Export

UI principles:

  • narrow
  • guided
  • training wheels by default
  • easier than Canva
  • less free-form than Figma

No blank canvas paralysis.


  13. ASSEMBLY ENGINE

  • beat order defines sequence
  • default durations and transitions
  • no timeline exposure
  • stitching handled internally
  • users can preview full video without export
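With order alone defining sequence, and defaults supplying duration and transitions, stitching reduces to computing placements internally. A sketch assuming a 3-second default beat duration and a 0.5-second crossfade overlap (both values are illustrative, not spec):

```python
DEFAULT_DURATION = 3.0    # seconds per beat (assumed default)
DEFAULT_TRANSITION = 0.5  # crossfade overlap in seconds (assumed default)

def assemble(beats: list[dict]) -> list[tuple]:
    """Place each beat on an internal timeline; order_index defines sequence.

    Returns (beat_id, start, end) tuples. The timeline is never exposed
    to the user; it only drives the stitched preview and final render.
    """
    timeline = []
    t = 0.0
    for i, beat in enumerate(sorted(beats, key=lambda b: b["order_index"])):
        start = t if i == 0 else t - DEFAULT_TRANSITION  # overlap for crossfade
        end = start + DEFAULT_DURATION
        timeline.append((beat["id"], start, end))
        t = end
    return timeline
```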

  14. OBSERVABILITY AND SUCCESS SIGNALS

Success is measured by learning velocity.

Key signals:

  • average beat regenerations per project
  • time to first assembly preview
  • percentage reaching final render
  • guidance acceptance rate
  • voice usage ratio
  • drop-off by state

High regeneration with low abandonment is healthy.
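Two of these signals can be sketched directly from an event stream: average regenerations per project and guidance acceptance rate. The event-tuple shape and event-type names below are illustrative assumptions, not a logging spec.

```python
from collections import Counter

def learning_signals(events: list[tuple]) -> dict:
    """Compute learning-velocity signals from (project_id, event_type) pairs.

    Assumed event types: "regenerate", "guidance_shown", "guidance_accepted".
    """
    by_type = Counter(kind for _, kind in events)
    projects = {pid for pid, _ in events}
    shown = by_type["guidance_shown"]
    return {
        "avg_regenerations": by_type["regenerate"] / max(len(projects), 1),
        "guidance_acceptance_rate": by_type["guidance_accepted"] / shown if shown else 0.0,
    }
```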


  15. MUSIC AND SOUND (FUTURE SCOPE)

Music is treated as a beat-aware layer.

Principles:

  • sound follows story
  • music aligns to beats, not timestamps
  • silence is a valid beat outcome
  • no second timeline

Music is a future extension, not a missing feature.


  16. COMPETITIVE FRAME (INTERNAL)

Others optimize for generation. StoryBeats optimizes for iteration.

Others expose complexity. StoryBeats absorbs it.

Others assume expertise. StoryBeats teaches by doing.


  17. FINAL NORTH STAR

A tired non-expert can speak an idea, iterate cheaply, understand failures, and assemble a real video without fear, friction, or guesswork.

If this is true, nothing else matters.
