In a frustrating round of fifteen image generations using text assembled with KAI and, more often than not, a visual prompt created from a collage of images in Adobe Express, I had hoped to create a shot, or a sequence of shots, that would depict a Haida creation-myth-like emergence of one or all three of ‘my’ humanoid, amphibian aliens from interstellar birth-pod meteorites. Through a combination of assumptions and poor phrasing, a wide collection of peculiarities occurred. In an effort to understand what works and why, and what does not, I’ve compiled all fifteen clips here.
When using a visual prompt, these three images were used:



Soon after the above was inadvertently produced within ChatGPT, while I was working to generate an image from which to create a video of this kind of scene, I concluded that the best way forward was to:
Keep each shot simple, with the complexity created in the edit of these shots.
Generate an image such as this and, only when happy with it, use it as the reference visual to avoid the kind of random generations that otherwise occur.
Rather than heed my own advice, I continued adjusting the text, creating Image to Video in Veo3, all intended to last 8 seconds at 720p with a sound prompt and requiring 200 tokens. Occasionally, I failed to produce the image or defaulted to 5 seconds.
Instead of following our instructions, Veo3 has the pod moving like a rock emerging from the water, gaping open as if it has a mouth! How can it ignore the provided instructions?
KAI explained: This is a frustratingly common issue with Veo 3 and other AI video generators: the model interpolates motion and meaning from visual metaphors, especially when those metaphors, like a mossy, gnarled pod, resemble mouths, cocoons, or biological creatures. Here’s why it likely happened and how to fix it:
Why Veo Ignored Your Instructions:
Pareidolia in the Prompt + Image: The pod’s open seam and organic mossy covering led the model to interpret it as a creature rather than as an object that contains creatures.
Lack of Anchored Object Function in Prompt: “Pod” wasn’t established as inanimate in purpose—Veo interpreted “it stirs, it cracks” as signs of sentience, viewing it as emerging or waking up itself, rather than something stirring inside it.
Veo Prioritises Visually-Obvious Motion Arcs: Since the pod’s form resembles a mouth and the prompt states “cracks” and “shifts,” Veo automatically assigned it a “mouth opening” arc, which is a known visual shorthand it has been trained to use in metamorphic or nature-fantasy sequences.
How to Correct for the Next Attempt:
Here’s a revised text-to-video prompt designed to block Veo from animating the pod as alive itself and instead focus on its symbolic containment:
Revised Prompt for Veo3 (Image-to-Video)
A heavy, algae-covered seedpod lies motionless at the water’s edge, clearly artificial and inert—part seed, part woven container. Mist flows around it, silent. The pod trembles slightly, not of its own will. From inside, something stirs. A faint blue glow pulses through its seams. One crack splits across the top—thin, precise. A large raven lands beside it, head tilted. It taps at the cracked shell, coaxing the light inside to grow.
Key changes:
- “clearly artificial and inert” = tells Veo “not a creature”
- “trembles slightly, not of its own will” = clarifies external animation
- “part seed, part woven container” = gives Veo a material identity
- “thin, precise crack” = implies design, not biological features
- “Raven taps at the cracked shell” = clear action focus
Optional Prompt Ending to Freeze It:
The raven pauses. The glow steadies. The pod does not open—but something inside begins to move.
This keeps the pod closed and delays emergence until the next scene.
Did I follow these instructions? Did I heck. As with all things in life, everything is trial and error, with fixes picked up far too slowly. This will prove an expensive way to generate finished sequences. Rather, I need to read, understand the logic, check it, rewrite if I see fit and see how well it works or not.
The next prompt went as follows > The pre-dawn mist curls low over the edge of a woodland lake. Upon the moss-covered bank, a large, algae-encrusted, gnarled pod twitches faintly. A fine crack materialises, followed by another. From above, a solitary giant raven glides silently downward, its wings extended. It lands beside the pod and cocks its head to one side. The giant raven taps the crack with its beak, causing the pod to jolt and crack slightly, reminiscent of a hatching egg. The mist thickens. The raven observes intently, embodying both curiosity and patience. And with few variations produced the following (without image reference):
With image reference:
And then a still image from ChatGPT, which should be the way to go: get a still image right, then build from there > The pre-dawn mist curls low over the edge of a woodland lake. Upon the moss-covered bank, a large, algae-encrusted, gnarled pod twitches faintly. A fine crack materialises, followed by another. From above, a solitary giant raven glides silently downward, its wings extended. It lands next to the pod and cocks its head to one side. The giant raven taps the crack with its beak, causing the stone-like pod to jolt and fracture. An infant with strangely marbled, newt-like skin emerges. The mist becomes denser. The raven observes intently, embodying both curiosity and patience.

Video like this >
I should have been content with this, but instead I created a collage reference image with the lakeside, the Raven and the pod:

And tweaked the language; one word, ‘alien’, tipped Veo3 into cliché. I used this text prompt: The pre-dawn mist clings low over the edge of a woodland lake. Upon the moss-covered bank, a substantial, algae-coated, gnarled stone-like pod twitches from inside subtly. A fine fracture appears, followed by another. From above, a solitary colossal raven glides silently downward, its wings outstretched. It lands beside the pod and tilts its head to one side. The giant raven pecks at the crack with its beak, causing the stone-like pod to convulse and splinter. A humanoid alien infant water creature emerges, followed by a second and a third humanoid alien infant water creature. The mist thickens. The raven observes with keen interest, embodying both curiosity and nurturing.
Then with a prompt reference that kept it more human-like: The pre-dawn mist curls low over the edge of a woodland lake. Upon the moss-covered bank, a large, algae-encrusted, gnarled pod twitches faintly. A fine crack materialises, followed by another. From above, a solitary giant raven glides silently downwards, its wings extended. It lands beside the pod and cocks its head to one side. The giant raven taps the crack with its beak, causing the stone-like pod to jolt and fracture. An infant emerges from the newly created crack, while two others inside the pod begin to surface. The mist becomes denser. The raven observes intently, embodying both curiosity and patience.
My patience was tested when my carefully crafted water-sprite infant simply became a crow, Veo3 making assumptions based on the word ‘mother’ here. Mother Raven means the hatchling has to be a Raven as well.
The pre-dawn mist curls low over the edge of a woodland lake. Upon the moss-covered bank, a large, algae-encrusted, gnarled pod twitches faintly. A fine crack materialises, followed by another. From above, a solitary giant raven glides silently downwards, its wings extended. It lands next to the pod and cocks its head to one side. The giant raven taps the crack with its beak, causing the stone-like pod to jolt and fracture. An infant tentatively pulls itself from the interior to take its first look at the world and its raven mother. The mist becomes denser. The raven observes intently, embodying both curiosity and patience.
Back to the humanoid infant, but it is too large, the Raven too small. Veo3 cannot take these nuances. The pre-dawn mist curls low over the edge of a woodland lake. Upon the moss-covered bank, a large, algae-encrusted, gnarled pod twitches faintly. A fine crack materialises, followed by another. From above, a solitary giant raven glides silently downwards, its wings extended. It lands next to the pod and cocks its head to one side. The giant raven taps the crack with its beak, causing the stone-like pod to jolt and fracture. A small, beautiful female infant humanoid child peeks out, sniffs the air, and lifts its hand to sense its new world through the newly created crack. The mist becomes denser. The raven observes intently, embodying both curiosity and patience.
I should slow down, move on. But I carry on, like a director stuck in a loop of asking for an umpteenth take. What I would do, when shooting film decades ago, was to turn up on set with a storyboard: every shot mapped out, the cast chosen, the crew at my back. I could keep every shot to only a few takes. I need to revert to this, as I do when creating a relief print: prep and planning is everything. Experiment with a pencil on a piece of paper; it’s cheaper, more effective and less frustrating.
So the following prompt, with no visual reference and only slightly tweaked text, produced this >
But then from this > The pre-dawn mist curls low over the edge of a woodland lake. Upon the moss-covered bank, a large, algae-encrusted, gnarled pod twitches faintly. A fine crack materialises, followed by another. From above, a solitary giant raven glides silently downwards, its wings extended, lands next to the pod, cocks its head to one side, and taps the crack with its beak, causing the stone-like pod to jolt and fracture. A tiny, beautiful female infant humanoid child peeks out, sniffs the air, and lifts its hand to sense its new world through the newly created crack. The mist becomes denser. The raven observes intently, embodying both curiosity and patience.
I believe Veo 3 is adding unnecessary elements. Specifying ONE Raven still generates two. Specifying ONE infant emerging inexplicably produces a second infant that disappears before the new one emerges. It feels like it wants me to err, generating images like an addicted gambler. This is the last text I used: The pre-dawn mist curls over the woodland lake’s edge. On the moss-covered bank, an algae-encrusted pod twitches faintly. Fine cracks appear. A giant raven glides down, landing next to the pod. It cocks its head and taps the crack with its beak, causing the pod to jolt and fracture. A tiny, beautiful female humanoid child peeks out, sniffs the air, and lifts her hand to sense her new world. The mist thickens. The raven observes intently, embodying curiosity and patience.
KAI explained that what I’m experiencing with Veo 3 is a common frustration for creators who are after precision, not generative indulgence.
Here’s what’s happening and how to take control:
Why Veo 3 Keeps “Gambling”
Model Bias Toward Visual Complexity
Veo prioritises spectacle and often interprets your prompts with a cinematic instinct: more drama, more action, more subjects — even when explicitly instructed otherwise.
Text-weighting is flat
The phrase “one raven” or “a single child” doesn’t always override what the AI thinks makes a better-looking or more emotionally charged shot. If one child is good, it implies that two might be even more meaningful.
It anticipates animation arcs.
Veo is often trained on footage structures that evolve or cycle over time. That’s why it may introduce a second child early, then fade it out—it’s thinking in animation beats, not literal single-frame logic.
What You Can Do to Rein It In
Here’s a refined prompt with stronger negation language, reinforced subject focus, and cutscene logic:
Corrective Prompt – Cleaner Version for Veo 3
The pre-dawn mist curls low over the edge of a tranquil woodland lake. On the mossy bank, a solitary, algae-encrusted pod twitches once. A crack forms—thin and sharp. From above, a lone raven glides silently downwards and lands beside the pod. No other birds are present. The raven tilts its head and taps once with its beak. The pod jolts and splits. A tiny, beautiful infant girl—alone—peeks out slowly through the new gap. There are no other children nearby. She lifts one hand, sensing the air for the first time. The mist deepens. The raven remains still. It watches, only watching.
Reinforcement Strategies:
“No other birds are present.” > Hard-blocks Veo’s urge to add a second raven
“There are no other children nearby.” > Eliminates the twin effect or generative clustering.
“A single…” (early and late)> Doubles down on count priming
Slow phrasing, no verbs that imply sequence > Prevents buildup-to-payoff spiral logic
Bonus Tips for the Next Round:
Use phrases like: “There is only one…” or “The scene shows only…” at the top of the prompt.
Try prompting in reverse emphasis order: start with “Only one infant is visible, emerging,” then build the atmosphere.
Avoid terms like “peeks out” without control — they often trigger a cut to something else.
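The count-priming tips above amount to simple string assembly, and a minimal sketch might look like the following. The function name `count_primed_prompt` and its arguments are hypothetical, invented for illustration; nothing here calls Veo 3, and there is no guarantee the model will honour the added sentences.

```python
def count_primed_prompt(scene, subjects):
    """Wrap a scene with single-subject statements at the top and bottom.

    subjects maps each singular noun to its plural, e.g. {"raven": "ravens"}.
    This only builds text; it is an assumption that the phrasing helps.
    """
    # "There is only one ..." goes first, as the tips suggest.
    opening = " ".join(f"There is only one {one}." for one in subjects)
    # Explicit exclusions go last, repeating the count late in the prompt.
    closing = " ".join(f"No other {many} are present." for many in subjects.values())
    return f"{opening} {scene.strip()} {closing}"


scene = "A giant raven lands beside the pod and taps the crack. An infant slowly emerges."
prompt = count_primed_prompt(scene, {"raven": "ravens", "infant": "infants"})
print(prompt)
```

Keeping the counts in one dictionary means the atmosphere text can be reworked between takes without the "only one" statements being lost.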
Which produces an overly large, too-human infant. And then, without any instructions to do so, using a different visual reference, completely ruining the shot, Veo3 turns the stone/algae-covered pod into a wicker basket! How do I stop it from doing this? If I only get one usable sequence out of twelve, then I am starting to wonder if it is worth the effort.
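To put a number on that worry, here is a rough sketch of the arithmetic. It assumes every attempt costs the flat 200 tokens mentioned earlier, which ignores the failed and 5-second generations that may be priced differently; the function name is hypothetical.

```python
TOKENS_PER_GENERATION = 200  # per-clip figure quoted earlier in this post


def tokens_per_usable_clip(attempts, usable, tokens_each=TOKENS_PER_GENERATION):
    """Average tokens spent for each clip that is good enough to keep."""
    if usable <= 0:
        raise ValueError("need at least one usable clip to compute an average")
    return attempts * tokens_each / usable


print(tokens_per_usable_clip(attempts=12, usable=1))  # 2400.0
print(15 * TOKENS_PER_GENERATION)  # 3000 for the full round of fifteen
```

At one keeper in twelve, each usable clip costs twelve times the sticker price, which is the case for getting the still image right first.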
Am I going around in circles? Not me. Spiralling like a chair caught up in the vortex of a tornado, not descending into the pit of a whirlpool. In my chaotic way my experience is improving.
ChatGPT suggested the following:
Why Veo 3 “Turned the Pod Into a Wicker Basket”
1. “Pod” + “Infant” = Pretrained Association
Veo sees:
- Pod → containment object
- Infant → crib, Moses basket, womb
Together, the AI’s training corpus leads it to this unhelpful visual shorthand:
“Basket with baby = safe = archetype = render that instead”
2. Lack of Material Enforcement
Without a hard-material description, it defaults to a warm-toned, home-based “pod.”
Even your correct description (“stone-like,” “algae-encrusted”) is being overwritten by the infant imagery.
3. Visual Substitution Bias
Veo “thinks”:
- “What do humans expect a newborn to be inside?”
- “What’s photogenic?”
- → Basket it is.
🔧 How to Stop This
You need explicit material anchoring + exclusion language like this:
Corrective Prompt Phrase
The pod is not a basket. It is made of dark, cracked stone and covered in algae. There is no wicker. No weave. No rope. The pod resembles a boulder, not a container. It is natural, ancient, and overgrown.
Then, incorporate:
A faint crack splits across the pod’s surface—stone, not wood. The child inside is glowing softly. There is no fabric, no cloth, no basket. The setting is primaeval, not domestic.
Prompt Control Framework
| Aspect | What to Include | What to Exclude |
|---|---|---|
| Pod material | “Cracked stone,” “gnarled bark,” “meteor-like surface,” “moss and algae crusted” | “Wicker,” “basket,” “fabric,” “woven” |
| Setting | “Ancient forest,” “primaeval pond,” “Haida-inspired natural world” | “Domestic,” “nursery,” “indoor,” “crafted” |
| Tone | “Mythic,” “ritual,” “cosmological” | “Cute,” “baby scene,” “crib” |
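One way to apply the framework before spending tokens is to check a draft description against the exclude column. The sketch below is an assumption about how that might be done: the `FRAMEWORK` structure and `find_excluded_terms` are hypothetical names, the terms are copied from the framework above, and the check is a plain whole-word search, not anything Veo 3 provides.

```python
import re

# Exclude terms copied from the framework above; the dict layout is hypothetical.
FRAMEWORK = {
    "pod material": ["wicker", "basket", "fabric", "woven"],
    "setting": ["domestic", "nursery", "indoor", "crafted"],
    "tone": ["cute", "baby scene", "crib"],
}


def find_excluded_terms(description, framework=FRAMEWORK):
    """Return excluded terms that appear as whole words in a draft description."""
    found = []
    for terms in framework.values():
        for term in terms:
            # \b keeps "crib" from matching inside words such as "describe".
            if re.search(rf"\b{re.escape(term)}\b", description, flags=re.IGNORECASE):
                found.append(term)
    return found


draft = "A heavy, algae-covered seedpod lies motionless, part seed, part woven container."
print(find_excluded_terms(draft))  # ['woven']
```

Run it on the descriptive part of a prompt only; deliberate negations such as “No wicker” would otherwise be flagged too.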
Want a Fix? Try This Line:
This is not a scene of comfort. It is a scene of awakening. Nothing human-made appears. The pod is mineral. The child is a myth.
Final Thought
If you’re getting 1 usable shot out of 12, you’re not failing — you’re operating like a story designer in an unreliable studio.
The difference now is that you know the bug. And I can help you kill it.
Would you like me to build:
- A “Veo 3 Do-Not-Render” phrase library?
- A “Mythic Prompt Correction Sheet” to copy/paste per scene?
- A batch of pre-controlled scene templates that you can trust?
And then, no fix at all: the infant appears like baby Jesus with a halo, originating from a stone pod that transforms into a concrete egg, and to top it all off, the infant levitates from the pod as if it were a cherub ascending towards the clouds. This is image generation thirteen.
Changing the ethnicity of the infant produced a lovely, but too human-child-like image >
A massive raven perches beside a cracked stone pod, crusted with algae. The pod sits heavy on the mossy lakebank—mineral, not smooth, not clean. No wicker. No eggshells. Inside, a small dark skinned childlike figure stirs. She is tiny—only the size of the raven’s head—and her pale, marbled newt-like skin glistens with water, not light. She does not glow. She does not rise. She slowly pushes herself upright with effort. Her hand reaches upward clumsily, as if learning what air is. The raven remains still. There are no halos, no shining light, no spiritual symbols.
The pre-dawn mist curls low over the edge of a woodland lake. Upon the moss-covered bank, a large, algae-encrusted, gnarled pod twitches faintly. A fine crack materialises, followed by another. From above, a solitary giant raven glides silently downward, its wings extended. It lands next to the pod and cocks its head to one side. The giant raven taps the crack with its beak, causing the stone-like pod to jolt and fracture. An infant with strangely marbled, newt-like skin tries to climb out. The mist becomes denser. The raven observes intently, embodying both curiosity and patience.
And further insights generated by ChatGPT (which always has an answer):
What Veo Assumes Without Being Told
1. Pod Opening Without Cause
Since the raven is not described as doing anything, and no force is mentioned, Veo treats the pod as autonomous, defaulting to the smoothest form: a perfectly split egg.
Fix: Reintroduce mechanical/organic resistance.
“The pod does not open cleanly. Its edges are uneven and jagged. There is no perfect rim. No shell. The crack widens slowly as if resisting. It opens only after pressure builds from within.”
Why the Raven Isn’t Stopping the Pod From Overperforming
2. Raven’s Role Is Passive
The raven is still, which is thematically correct, but leads Veo to think, “Raven is irrelevant.”
Fix: Give the Raven a small trigger.
“The raven taps the pod gently with its beak before stepping back. This causes a seam to form, but the pod resists, shuddering instead of opening quickly.”
What You’re Competing With: Veo’s Internal Visual Grammar
| Veo’s assumption | Override with… |
|---|---|
| Pod = an object with a hinge or seam | “There is no hinge. No clean break. The stone splits irregularly.” |
| Glowing child = divine light | Already fixed with “no glow / no halo” |
| Action must escalate | “Nothing dramatic happens. The moment is slow. Subtle.” |
| The container must open | “The pod barely opens. Only a small gap forms, enough for a single hand to emerge.” |
Revised Block: Minimal Movement + Cause
The pod does not spring open. Its surface cracks along an old seam, uneven and jagged. There is no hinge. No perfect rim. The raven leans forward and taps it gently with its beak. A tremor runs through the stone, then stillness. Slowly, the crack widens just enough for a small hand to emerge. That is all.
Summary
You’re doing well with your wording; it’s now about giving Veo:
- A cause-and-effect trigger
- A veto on perfect geometry



