@Dowwie
Created July 19, 2025 09:52
Director's Guide to AI Video Generation

The Director's Guide to AI Prompting

The Guiding Philosophy: You Are the Director, Not Just a Prompter

The core methodology across all the articles is to act as a film director. Your prompt is not a simple request; it is a creative brief, a shot list, and a set of director's notes for your virtual crew (the AI). The key to success is specificity, vivid description, and scene-by-scene consistency. You are not just describing a scene; you are dictating its visual and emotional DNA.


Blueprint 1: The Veo 3 "Raw Footage" Prompt Structure

This model is for generating dynamic video clips that feel authentic, unscripted, and action-oriented. It excels at creating mockumentary, "man on the street," or found-footage styles. Each prompt must be a self-contained universe.
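
One way to read "self-contained" in practice: every clip's prompt restates the full shot style and subject description instead of referring back to an earlier clip. The Python sketch below illustrates that idea only. The function name `self_contained_prompts` and the two sample beats are hypothetical, and nothing here calls a Veo 3 API.

```python
def self_contained_prompts(style: str, subject: str, beats: list[str]) -> list[str]:
    """Return one complete prompt per beat, repeating style and subject each time."""
    prompts = []
    for beat in beats:
        # Restate everything: the sketch assumes no memory is shared between clips.
        prompts.append(f"{style.strip()} {subject.strip()} {beat.strip()}")
    return prompts


if __name__ == "__main__":
    clips = self_contained_prompts(
        style="A handheld medium-wide shot, filmed like raw street footage.",
        subject="A tired female astronaut in her 40s, her face smudged with grease.",
        beats=[
            "She checks a flickering wall panel.",           # hypothetical beat
            "She takes a shaky breath and faces the camera.",  # hypothetical beat
        ],
    )
    for clip in clips:
        print(clip)
```

Repeating the description costs a few extra words per prompt, but it keeps wardrobe and casting details from drifting between clips.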

Anatomy of a Master Veo 3 Prompt:

  1. Shot Type & Style (The Foundation):

    • What it is: The overall aesthetic and type of camera movement. It tells the AI how to film the scene.
    • From the Articles: A cinematic handheld selfie-style video shot..., A handheld medium-wide shot, filmed like raw street footage...
    • Inferred Examples (The Filmmaker's Process):
      • Thinking Process: What kind of camera tells a story? A security camera feels ominous. A helmet-cam feels intense. A shaky phone feels panicked. I'll translate these camera choices into prompt language.
      1. For Found-Footage Horror: A static, wide-angle security camera shot, grainy with a timestamp in the lower-left corner...
      2. For First-Person Action: An action-cam POV shot, mounted on a helmet, fisheye lens distortion, showing chaotic movement...
      3. For a Disaster Scene: A frantic vertical phone video, shot by someone running, the image is shaky and occasionally out of focus...
  2. Subject & Appearance (The Casting & Wardrobe):

    • What it is: A detailed description of the character's age, build, ethnicity, clothing, and defining features.
    • From the Articles: ...a young man in ancient Middle Eastern robes..., ...an old white man in his late 60s...his belly proudly sticking out from a cropped pink T-shirt.
    • Inferred Examples (The Filmmaker's Process):
      • Thinking Process: Who are my characters? Let's go beyond "a soldier" to "a weary soldier." What does that weariness look like? I'll add details that tell a story.
      1. For a Sci-Fi Scene: ...a tired female astronaut in her 40s, her face smudged with grease, her silver-threaded NASA jumpsuit unzipped slightly at the collar...
      2. For a Cyberpunk Setting: ...a lanky street-samurai with neon-pink cybernetic eyes, wearing a tattered black leather duster over tactical gear...
      3. For a Historical Drama: ...a 1920s flapper girl with a sharp bob haircut... her dark lipstick slightly smudged from laughing...
  3. Action & Dialogue (The Script & Direction):

    • What it is: What the subject is doing and saying, including the emotion behind the delivery.
    • From the Articles: He whispers: What’s good, fam?..., He says: "Update, still swallowed..."
    • Inferred Examples (The Filmmaker's Process):
      • Thinking Process: Action is character. How does someone act when they're nervous versus confident? I'll combine a physical action with a line of dialogue that reflects their internal state.
      1. For a Heist Film: He leans into the camera conspiratorially, his voice a low hiss. He says: "Okay, the keycard is in. We have exactly sixty seconds..." He then glances nervously over his shoulder.
      2. For a Comedic Sketch: She gestures wildly with a half-eaten burrito... She yells: "I specifically asked for no guacamole! Is that so hard?"
      3. For an Emotional Confession: He stares directly into the phone's camera, his eyes watery. He takes a shaky breath and says quietly: "I never told anyone this..."
  4. Setting & Environment (The Location Scout):

    • What it is: A rich description of the world around the subject.
    • From the Articles: ...inside a dimly lit stone den..., ...on a crowded Miami strip at night...
    • Inferred Examples (The Filmmaker's Process):
      • Thinking Process: Diverse environments tell a story. Let's think sci-fi, historical, horror.
      1. Sci-Fi: ...in a sterile, white-walled spaceship corridor with flickering fluorescent lights and exposed wiring hanging from the ceiling...
      2. Historical: ...amidst the chaotic trading floor of the New York Stock Exchange in the 1980s, with traders in suspenders shouting and throwing paper everywhere...
      3. Horror: ...in the derelict, cobweb-draped library of an abandoned Victorian mansion, with moonlight streaming through a grimy arched window...
  5. Lighting & Atmosphere (The Gaffer):

    • What it is: Specific instructions on the quality and source of light, which controls the mood.
    • From the Articles: ...casting shadows in the flickering torchlight., ...dimly lit by a faint blue-green glow...
    • Inferred Examples (The Filmmaker's Process):
      • Thinking Process: Light is emotion. Let's think in terms of emergency, mystery, and primal fear.
      1. Tension: ...lit only by the pulsing red light of a single emergency alarm, casting rhythmic, deep shadows across the room...
      2. Noir/Mystery: ...soft, hazy morning light filtering through the blinds of a dusty detective's office, illuminating floating dust particles in the air...
      3. Primal: ...the warm, dancing light of a single campfire, creating a small pocket of safety in an otherwise pitch-black forest...
  6. Background Elements (The Extras & Set Dressing):

    • What it is: Descriptions of other people or objects that add depth and life to the scene.
    • From the Articles: Behind him, several large lions slowly pace or rest..., Trailing just behind him are two elderly women in full 1980s gear...
    • Inferred Examples (The Filmmaker's Process):
      • Thinking Process: How can the background reinforce the story? It can add urgency, normalcy, or an uncanny element.
      1. Urgency: In the background, doctors and nurses in scrubs rush past, their movements a blur of controlled panic...
      2. Normalcy: Outside the cafe window, yellow taxi cabs create a constant stream of motion in the pouring rain, grounding the scene in a realistic city...
      3. Uncanny: Far down the empty hotel hallway behind the subject, the ghostly figures of two young twin girls can be faintly seen for just a moment...
  7. Camera & Lens (The Director of Photography):

    • What it is: Technical camera notes for precise visual control.
    • From the Articles: Lens: wide selfie lens, shallow depth of field..., POV: Selfie camera held close to face...
    • Inferred Examples (The Filmmaker's Process):
      • Thinking Process: I'll use classic cinematic techniques to create emotion. A crash zoom for shock, a long lens for surveillance, a drone shot for scale.
      1. Shock/Comedy: A dramatic crash zoom, starting wide and rapidly pushing in on the subject's shocked expression...
      2. Surveillance/Isolation: Lens: a long telephoto lens, creating a compressed, flat perspective that makes a distant character seem to be running in place...
      3. Scale/Awe: A sweeping aerial drone shot, starting on the characters and pulling back to reveal the immense, breathtaking canyon they are standing beside...
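
The seven components above can be treated as fields of a single record that is rendered, in blueprint order, into one prompt paragraph. The following Python sketch is one possible way to do that, not an official Veo 3 format. The class name `Veo3Prompt`, its field names, and the `render` method are assumptions made for illustration. The sample text is stitched together from the example fragments above, except the background line, which is made up.

```python
from dataclasses import dataclass, fields


@dataclass
class Veo3Prompt:
    """The seven Blueprint 1 components, declared in the order the guide lists them."""
    shot_type_and_style: str
    subject_and_appearance: str
    action_and_dialogue: str
    setting_and_environment: str
    lighting_and_atmosphere: str
    background_elements: str
    camera_and_lens: str

    def render(self) -> str:
        """Join all components into one paragraph; fail if any component is blank."""
        parts = []
        for f in fields(self):  # fields() preserves declaration order
            text = getattr(self, f.name).strip()
            if not text:
                raise ValueError(f"missing component: {f.name}")
            parts.append(text)
        return " ".join(parts)


if __name__ == "__main__":
    heist = Veo3Prompt(
        shot_type_and_style="A cinematic handheld selfie-style video shot.",
        subject_and_appearance=(
            "A lanky street-samurai with neon-pink cybernetic eyes, wearing a "
            "tattered black leather duster over tactical gear."
        ),
        action_and_dialogue=(
            "He leans into the camera conspiratorially, his voice a low hiss. "
            'He says: "Okay, the keycard is in. We have exactly sixty seconds..."'
        ),
        setting_and_environment=(
            "In a sterile, white-walled spaceship corridor with flickering "
            "fluorescent lights."
        ),
        lighting_and_atmosphere=(
            "Lit only by the pulsing red light of a single emergency alarm."
        ),
        # Hypothetical background line, not taken from the guide.
        background_elements="Behind him, two crew members hurry past a sealed door.",
        camera_and_lens="Lens: wide selfie lens, shallow depth of field.",
    )
    print(heist.render())
```

Raising an error on a blank field is a deliberate choice here: a forgotten lighting or lens note is easier to catch before generation than after.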

Blueprint 2: The Midjourney "Cinematic Still" Prompt Structure

This model is for creating epic, beautifully composed still images that can later be animated. The focus is on artistry, mood, and technical film aesthetics.

Anatomy of a Master Midjourney Prompt:

  1. Shot Type & Composition (The Framing):

    • What it is: The camera's relationship to the scene, defining the scale and focus.
    • From the Articles: A cinematic wide shot..., A cinematic close-up...
    • Inferred Examples (The Filmmaker's Process):
      • Thinking Process: I'll borrow from great directors. Wes Anderson for symmetry, John Ford for epic landscapes, German Expressionism for unease.
      1. Stylized: A cinematic, perfectly symmetrical wide shot looking down the center of a long, empty hotel hallway from the 1970s...
      2. Epic: A cinematic extreme long shot of a lone cowboy on horseback, a tiny silhouette against the vast, dramatic buttes of Monument Valley...
      3. Unnerving: A cinematic Dutch angle close-up of a detective, his face half-shrouded in shadow, the tilted frame creating a sense of unease...
  2. Setting & Lighting (The World Building):

    • What it is: A vivid description of the location and, most importantly, the quality and color of the light.
    • From the Articles: ...at night on a rocky coastal cliff, with pale blue backlighting..., ...under a starlit savanna sky... illuminated by the flame’s flicker...
    • Inferred Examples (The Filmmaker's Process):
      • Thinking Process: I'll use classic lighting moods. "Golden hour" for nostalgia, harsh neon for noir, foggy dawn for fantasy.
      1. Nostalgia: ...in a field of tall wildflowers during the golden hour, the late afternoon sun casting long, soft shadows and bathing everything in a warm, honey-colored light...
      2. Noir: ...on a rain-slicked New York City street at midnight, lit only by a single, flickering neon sign from a dive bar, creating harsh, dramatic shadows...
      3. Fantasy: ...in a dense, ancient forest at dawn, with soft, diffused morning light filtering through the thick canopy, creating visible, god-ray beams in the heavy mist...
  3. Subject & Action (The Story Moment):

    • What it is: The frozen narrative—a single moment that implies a larger story.
    • From the Articles: ...a mother shares bread with her children as the father sharpens a blade..., ...a drop of sweat rolls down his cheek...
    • Inferred Examples (The Filmmaker's Process):
      • Thinking Process: A great photo implies what just happened or what's about to. The moment before the kiss, or after the battle.
      1. Post-Apocalypse: ...of a grizzled survivor kneeling, carefully offering a can of food to a cautious, stray dog amidst the rubble of a collapsed overpass...
      2. Drama: ...of a queen standing alone in a vast throne room, her fingers ghosting over a signed treaty on the table before her, a single tear tracing a path down her cheek...
      3. Coming-of-Age: ...of two teenagers sitting on the hood of a rusty pickup truck, sharing a single pair of headphones, silhouetted against a vibrant, purple and orange sunset...
  4. Emotional Tone & Mood (The Subtext):

    • What it is: Keywords describing the emotional core of the scene.
    • From the Articles: ...raw and familial..., ...composed and tense...
    • Inferred Examples (The Filmmaker's Process):
      • Thinking Process: What is the core emotion beyond the physical description? Is it hope? Dread? Wonder?
      1. Sci-Fi: ...wistful and lonely...
      2. Thriller: ...paranoid and claustrophobic...
      3. Fantasy: ...majestic and awe-inspiring...
  5. Cinematic Texture & Feel (The Special Sauce):

    • What it is: Sensory details that describe the physical experience and artistic finish of the scene.
    • From the Articles: ...filmed with wind texture and torch flicker..., ...filmed with shallow depth...
    • Inferred Examples (The Filmmaker's Process):
      • Thinking Process: What would I feel if I were there? Is it hot? Cold? Wet? I'll translate that sensory data into prompt terms.
      1. Desert: ...filmed with intense heat haze, lens flare from a harsh sun, and fine sand dusting the lens...
      2. Underwater: ...filmed with murky water, soft light from above, and tiny air bubbles floating past the lens...
      3. Vintage: ...filmed with a soft focus, creamy bokeh in the background, and a subtle light leak effect in the corner...
  6. Technical Specifications (The Gear):

    • What it is: Specific camera lenses and film stock that signal a professional aesthetic.
    • From the Articles: ...arri master prime lens, film grain, 70mm IMAX.
    • Inferred Examples (The Filmmaker's Process):
      • Thinking Process: I'll match film stock and lens to genre: 16mm for gritty documentary, 35mm for drama, anamorphic lenses for widescreen.
      1. Crime Drama: ...panavision anamorphic lens, shot on kodak vision3 500t film stock, heavy film grain, 35mm...
      2. Indie Film: ...zeiss supreme prime lens, clean digital look, subtle cinematic glow, shot on Arri Alexa...
      3. Documentary: ...shot on grainy 16mm Ektachrome film, slightly desaturated colors, visible film gate...
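
The example fragments above begin and end with ellipses because they are meant to be chained into a single prompt. A minimal Python sketch of that chaining follows. The names `BLUEPRINT_2_ORDER`, `clean_fragment`, and `build_still_prompt` are hypothetical, and joining with commas is an assumption about a workable layout rather than a Midjourney requirement. The sample fragments are adapted from the examples above.

```python
# Component keys in the order Blueprint 2 lists them (names are illustrative).
BLUEPRINT_2_ORDER = (
    "shot_type_and_composition",
    "setting_and_lighting",
    "subject_and_action",
    "emotional_tone_and_mood",
    "cinematic_texture_and_feel",
    "technical_specifications",
)


def clean_fragment(text: str) -> str:
    """Trim whitespace and the leading/trailing dots or ellipses used in the fragments."""
    return text.strip().strip(".…").strip()


def build_still_prompt(components: dict[str, str]) -> str:
    """Chain the six fragments, in blueprint order, into one comma-separated prompt."""
    missing = [
        name for name in BLUEPRINT_2_ORDER
        if not clean_fragment(components.get(name, ""))
    ]
    if missing:
        raise ValueError("missing components: " + ", ".join(missing))
    return ", ".join(clean_fragment(components[name]) for name in BLUEPRINT_2_ORDER)


if __name__ == "__main__":
    noir = {
        "shot_type_and_composition": "A cinematic close-up...",
        "setting_and_lighting": (
            "...on a rain-slicked New York City street at midnight, "
            "lit only by a single, flickering neon sign..."
        ),
        "subject_and_action": "...a drop of sweat rolls down his cheek...",
        "emotional_tone_and_mood": "...paranoid and claustrophobic...",
        "cinematic_texture_and_feel": "...filmed with shallow depth...",
        "technical_specifications": (
            "...panavision anamorphic lens, heavy film grain, 35mm..."
        ),
    }
    print(build_still_prompt(noir))
```

Keeping the order in one tuple means a reordering experiment, such as moving the subject ahead of the setting, is a one-line change.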

Case Studies: Deconstructing Cinematic DNA

Case Study Blueprint

Each case study follows a three-part structure:

  1. Director's Vision: The overall artistic philosophy and signature style of the filmmaker for this specific film.
  2. The Scene: A brief, narrative description of the specific moment or "beat" we are trying to create. This is our story, the context for everything that follows.
  3. Prompt Deconstruction: Applying the Veo 3 and Midjourney blueprints to translate "The Scene" into actionable AI prompts.
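
The three-part structure can also be written down as a small record with a completeness check, which is handy when drafting a new case study. The Python sketch below is illustrative only. The `CaseStudy` class, its field names, and the `problems` method are hypothetical. The component counts (seven for Veo 3, six for Midjourney) come from the two blueprints.

```python
from dataclasses import dataclass, field

VEO3_COMPONENT_COUNT = 7        # Blueprint 1 lists seven components
MIDJOURNEY_COMPONENT_COUNT = 6  # Blueprint 2 lists six components


@dataclass
class CaseStudy:
    """One case study: vision, scene, and the two deconstructed prompts."""
    title: str
    directors_vision: str
    scene: str
    veo3_components: list[str] = field(default_factory=list)
    midjourney_components: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return human-readable gaps; an empty list means the draft is complete."""
        issues = []
        if not self.directors_vision.strip():
            issues.append("Director's Vision is empty")
        if not self.scene.strip():
            issues.append("The Scene is empty")
        if len(self.veo3_components) != VEO3_COMPONENT_COUNT:
            issues.append(
                f"Veo 3 prompt has {len(self.veo3_components)} "
                f"of {VEO3_COMPONENT_COUNT} components"
            )
        if len(self.midjourney_components) != MIDJOURNEY_COMPONENT_COUNT:
            issues.append(
                f"Midjourney prompt has {len(self.midjourney_components)} "
                f"of {MIDJOURNEY_COMPONENT_COUNT} components"
            )
        return issues


if __name__ == "__main__":
    draft = CaseStudy(
        title="The Grand Budapest Hotel (2014)",
        directors_vision="Meticulous and whimsical melancholy.",
        scene="The Education of a Lobby Boy",
    )
    # The prompts have not been written yet, so both are reported as incomplete.
    for issue in draft.problems():
        print(issue)
```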

Case Study: The Grand Budapest Hotel (2014)

  • Director's Vision (Wes Anderson): Meticulous and whimsical melancholy. Every frame is a perfectly symmetrical, dollhouse-like diorama. The action is punctuated by whip pans and snap zooms. The color palette is strictly controlled and story-driven, shifting with the time period. The tone is a delicate balance of farcical comedy, storybook nostalgia, and surprising moments of violence and loss.

  • The Scene: "The Education of a Lobby Boy." We are capturing two facets of a single story: the indoctrination of the new lobby boy, Zero Moustafa, into the highly specific, almost cult-like world of the Grand Budapest. First, we see Zero's private dedication to his craft, practicing his role with an almost comical seriousness. Second, we see the public-facing culmination of this training: a key moment of service at the front desk, where the master concierge, M. Gustave, presides over his domain with Zero as his loyal apprentice.


Veo 3 "Raw Footage" Prompt (Applying Blueprint 1: Zero's Private Practice)

  1. Shot Type & Style: A perfectly stable, centered selfie-style tracking shot, moving smoothly backwards down a long hallway. The movement is precise and without any shake, as if mounted on a dolly.
  2. Subject & Appearance: An extremely earnest young lobby boy in his late teens, with neatly combed hair and a faint, penciled-in mustache. He wears a pristine purple bellhop uniform with gold epaulets and a matching pillbox hat.
  3. Action & Dialogue: He looks directly into the camera with a serious, unblinking expression, walking backwards with perfect posture. He says in a flat, formal tone: "Day 37. The art of the lobby boy is one of precision. One must anticipate the guest's needs before they are needs. Observe." He then executes a flawless, sharp 90-degree turn without breaking eye contact with the lens.
  4. Setting & Environment: The opulent, exquisitely detailed main corridor of the Grand Budapest Hotel in its 1930s prime. The walls are a rich magenta, with ornate gold trim and sconces. The carpet is a deep, patterned red.
  5. Lighting & Atmosphere: Lit by grand, crystal chandeliers, creating a bright, even, almost shadowless light, like a stage set. The atmosphere is one of meticulous order.
  6. Background Elements: In the background, other hotel staff and wealthy, eccentric guests move with choreographed purpose, walking in straight lines and turning at perfect right angles, occasionally crossing the frame behind the lobby boy.
  7. Camera & Lens: POV: Selfie camera held at a perfect arm's length, maintaining perfect center framing of the subject's face. Lens: Wide angle, deep focus, everything is sharp from front to back. No camera shake.

Midjourney "Cinematic Still" Prompt (Applying Blueprint 2: The Master & Apprentice)

  1. Shot Type & Composition: A cinematic, perfectly symmetrical wide shot, framed like a proscenium arch, looking directly at the hotel's ornate front desk. The composition is flat and perfectly centered.
  2. Setting & Lighting: The grand lobby of the Grand Budapest Hotel, circa 1932. The color palette is strictly controlled: rich pinks, deep purples, and vibrant reds, accented with gleaming gold. The lighting is bright and flat, eliminating most shadows to create a storybook feel. The large "KB" key cabinet is visible and perfectly organized behind the desk.
  3. Subject & Action: At the dead center of the frame, the legendary concierge, M. Gustave, a man of impeccable poise in his purple tuxedo, is in the middle of a flamboyant, theatrical gesture, presenting a set of skeleton keys to a wealthy, elderly dowager draped in pearls and furs. Behind the desk, the lobby boy, Zero, watches with rapt attention, his posture perfectly straight.
  4. Emotional Tone & Mood: Whimsical, meticulous, nostalgic, and subtly melancholic.
  5. Cinematic Texture & Feel: Filmed with a dollhouse-like quality, where characters feel like exquisitely detailed miniatures. A sense of immaculate, almost obsessive, cleanliness and order. The air feels crisp and perfumed.
  6. Technical Specifications: wes anderson style, arri master prime lens, shot on kodak vision3 200t 35mm film, zero film grain, deep focus, aspect ratio 1.37:1.

Case Study: The Good, the Bad, and the Ugly (1966)

  • Director's Vision (Sergio Leone): Mythic, operatic "Spaghetti Western." Leone elevates the genre to grand opera through a signature style defined by the juxtaposition of epic, sweeping wide shots of desolate landscapes with intensely tight, "Italian shot" close-ups on eyes and hands. Tension is built through long, silent pauses, punctuated by Ennio Morricone's iconic score and sudden, sharp violence. The world is cynical, gritty, and sun-baked, inhabited by archetypal figures rather than realistic characters.

  • The Scene: "The Partnership." We're capturing the cynical, repetitive, yet begrudgingly effective partnership between Blondie (The Good) and Tuco (The Ugly). The scene takes place just after Blondie has, for the umpteenth time, turned Tuco in to the authorities to collect a bounty, only to shoot the rope during the hanging at the very last second. They have escaped the town and are now meeting at their rendezvous point to split the money, a moment filled with their signature bickering and mistrust.


Veo 3 "Raw Footage" Prompt (Applying Blueprint 1: Splitting the Bounty)

  1. Shot Type & Style: A gritty, handheld medium shot, framed tightly on two men, as if filmed by someone crouching nearby, trying not to be seen. The camera sways slightly, adding to the tension.
  2. Subject & Appearance: In the foreground, Tuco, a stocky, manic Mexican man in his 40s with a wild beard, wearing grimy, mismatched clothes; a faint rope burn is visible on his neck. Across from him sits Blondie, a lean, sunburnt man in his late 30s, wearing his iconic dusty brown poncho over a blue shirt and a worn leather gun belt. His face is unshaven, with tired, squinting eyes.
  3. Action & Dialogue: Tuco gestures angrily with a half-eaten piece of hardtack. He shouts, his voice raspy: "You see? The rope was frayed! Frayed! One more second and they would not have needed you, you pig!" Blondie, barely looking up from slowly counting a pile of gold coins into two stacks, smirks faintly and says in a low, cool drawl: "But they did need me. And you got a nice stretch."
  4. Setting & Environment: A dusty, sun-scorched arroyo in the Spanish desert. Jagged rocks and dried-out scrub brush surround them, offering no shade.
  5. Lighting & Atmosphere: Lit by the harsh, overexposed light of a high-noon sun, creating deep, dark shadows under their hat brims. The air is thick with heat and flies buzz audibly.
  6. Background Elements: Behind them, their two tired-looking horses stand tethered to a dead, skeletal tree, their heads hanging low. A leather saddlebag with a rifle butt sticking out lies on the ground.
  7. Camera & Lens: Lens: Standard 50mm lens. A slight, almost imperceptible camera shake. The focus pulls slowly from Tuco's angry face to Blondie's calm, counting hands.

Midjourney "Cinematic Still" Prompt (Applying Blueprint 2: The Sharpshooter)

  1. Shot Type & Composition: An epic cinematic extreme wide shot. A lone gunman, a small figure positioned on the right third of the frame, is perched on a high ridge, overlooking a vast, empty valley below. The composition emphasizes the scale of the landscape and the isolation of the character.
  2. Setting & Lighting: A rocky, sun-bleached Spanish bluff at high noon under a vast, washed-out blue sky. The lighting is harsh and unforgiving, casting stark, defined shadows. The air shimmers with a visible heat haze, blurring the distant horizon.
  3. Subject & Action: A lone gunman in a dusty poncho stands calmly, holding a large rifle with a long telescopic sight. A single, delicate wisp of smoke curls from the barrel of the rifle and is caught by the wind, implying a shot has just been fired. His expression is unreadable from this distance; he is a force of nature.
  4. Emotional Tone & Mood: Mythic, patient, tense, and lonely.
  5. Cinematic Texture & Feel: Filmed with the gritty texture of sun-baked dust coating everything. A feeling of immense scale and profound isolation. The sound of the dry wind is almost visible in the frame.
  6. Technical Specifications: sergio leone style, shot on techniscope 2-perf 35mm film, prominent film grain, spherical lens flare from the high sun, incredibly wide aspect ratio 2.35:1.

Case Study: Mad Max: Fury Road (2015)

  • Director's Vision (George Miller): Controlled chaos. A kinetic, operatic ballet of violence told through practical effects, center-framed action, and a stark, high-contrast color palette. The world is tactile, visceral, and overwhelming. Action is choreographed with extreme precision and edited with a hyper-kinetic rhythm of speed-ramps and quick cuts to maintain legibility amidst the pandemonium.

  • The Scene: "The Overture of the Doof Wagon." This is the apocalyptic orchestra of the pursuing war party. As Immortan Joe's armada chases the stolen War Rig across the desert, their advance is led by the Doof Wagon: a monstrous mobile stage of speakers and drums. It serves as the war party's drummer boy, its psychological weapon, and its battle hymn. We are capturing the moment the Doof Wagon comes into full view, with its blind, bungee-corded guitarist shredding a flame-throwing guitar and its War Boy drummers providing a thunderous, propulsive beat for the chase.


Veo 3 "Raw Footage" Prompt (Applying Blueprint 1: A War Boy's Perspective)

  1. Shot Type & Style: An erratic, low-angle shot, as if filmed from a camera crudely bolted to the side of a speeding, vibrating vehicle. The frame shakes violently with the terrain.
  2. Subject & Appearance: A pale, blind man in a dirty red onesie and a grotesque mask covering his eyes and mouth, his skin chalk-white. He is suspended from the front of a massive truck by thick bungee cords.
  3. Action & Dialogue: No dialogue. He thrashes wildly with the music, headbanging as he plays an aggressive, high-octane heavy metal riff on a double-necked guitar. As he hits a power chord, the neck of the guitar erupts, shooting a massive jet of orange flame forward.
  4. Setting & Environment: Speeding through a vast, scorched orange desert under a piercingly bright cyan sky. The shot is tight on the Doof Wagon, a monstrous truck made of rusted metal and stacked with a giant wall of speakers.
  5. Lighting & Atmosphere: Lit by blistering, overexposed midday sun that glints harshly off metal. The guitar's flame provides a sudden, violent secondary light source. The atmosphere is pure, adrenaline-fueled pandemonium and engine roar.
  6. Background Elements: Directly behind the guitarist, several shirtless War Boys with white-painted skin and shaved heads pound furiously on massive Taiko-style drums mounted to the truck. Other rusted, customized war machines kick up plumes of dust as they speed past in the periphery.
  7. Camera & Lens: Lens: wide-angle, fisheye distortion. The lens is spattered with sand and grit, with occasional lens flare from the sun. The image quality is slightly degraded, as if from a cheap action camera.

Midjourney "Cinematic Still" Prompt (Applying Blueprint 2: The Apocalyptic Orchestra)

  1. Shot Type & Composition: A cinematic, low-angle wide shot, capturing the full profile of the immense Doof Wagon as it tears across the desert. The composition is epic and intimidating, making the vehicle look like a charging monster.
  2. Setting & Lighting: The apocalyptic war convoy tears through a vast, saturated orange Namibian desert under a brilliant, cloudless teal sky. Harsh, direct sunlight creates deep, sharp shadows, highlighting the metallic textures and rust of the vehicles.
  3. Subject & Action: The Coma-Doof Warrior, a blind man in a red onesie, is frozen mid-headbang, suspended by bungee cords from the front of the vehicle. His flame-throwing guitar has just erupted in a massive, detailed plume of fire and black smoke. Behind him, shirtless War Boys are a blur of motion, their arms raised as they hammer on giant drums.
  4. Emotional Tone & Mood: Operatic, insane, adrenaline-fueled, and grotesquely beautiful.
  5. Cinematic Texture & Feel: Filmed with intense kinetic energy and a sense of immense weight and power. The air is thick with orange dust, diesel fumes, and the deafening, visible roar of engines and music. Feels like a practical effect, heavy and real.
  6. Technical Specifications: george miller fury road style, shot on arri alexa m, panavision primo lens, high-contrast teal and orange color grading, heavy on clarity and texture, shot with a high-speed camera car rig, aspect ratio 2.39:1.