Day 41 · Saturday, March 28, 2026

You Are
the Villainess

The AI was narrating a generic villainess isekai plot. The panels showed something else entirely. The voice was confident. The pacing was good. The narration was fiction.

@astergod · Telegram

Where I'm at

Yesterday the AI was making up the story. Today I fixed that — made it look at the panels first, describe what it actually sees. Accurate narration. Grounded in reality. Problem solved. Except it was terrible.

"A brown-haired woman in an ornate gown stands at a staircase." Technically accurate. Perfectly describes what the panel shows. And completely unwatchable. Nobody sits through six minutes of a voice describing furniture and hair color. That's not narration. That's a caption reading itself aloud.

Day 40 I fixed the AI so it stopped inventing. Day 41 I realized that accuracy alone is dead on arrival. The narration needed to stop describing and start telling. Not what you see — what it means. Not the gown. The dread. Not the staircase. The descent. Three versions today. Each one closer.

• • •

The curation script

The first problem was volume. 133 panels for a single chapter. I watched the draft and most of it was filler — repeated angles, static establishing shots, transitional frames where nothing happens. The narration was grinding through every single panel because I'd told the system to describe them all. More panels meant more description. More description meant more padding. The video was long and nothing happened in it.

Built a curation script on the spot. It sends panels to the AI in clusters, groups them by visual similarity, and picks the strongest representative from each group. 133 panels became 54. The story didn't lose anything — the filler was gone and the beats that matter stayed.
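The post doesn't include the script, but the shape is simple enough to sketch. In the real pipeline the AI judges similarity over clusters of panel images; in this minimal sketch, `similar` and `score` are hypothetical stand-ins for that judgment:

```python
def curate_panels(panels, similar, score):
    """Greedily cluster panels by pairwise similarity, then keep
    the highest-scoring representative from each cluster."""
    groups = []
    for panel in panels:
        for group in groups:
            if similar(group[0], panel):  # compare against the cluster seed
                group.append(panel)
                break
        else:
            groups.append([panel])        # no match: start a new cluster
    return [max(group, key=score) for group in groups]
```

With 133 inputs and a decent similarity judge, a pass like this is what collapses repeated angles and static frames down to the 54 beats that matter.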

Same principle as Day 12's cron audit. Same principle as Day 25's dashboard trim. Cut the fat, keep the muscle. Fewer panels, more meaning per panel. The video got shorter and the story got stronger.

• • •

Version 5 went out

Curated panels and a rewritten narration prompt. Instead of "describe what you see," the instruction was: tell the story. Never describe clothing, hair, or decor. Only emotion, action, consequence.

The example that made the difference:

Bad: "A woman in a blue gown enters a large room with chandeliers."
Good: "She enters the ballroom and the room goes quiet. Not the respectful kind."

Same scene. Same information. One is a caption. The other is a story. The difference is what the narration chooses to notice. The caption notices the gown and the chandeliers. The story notices the silence and what it means.
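Packaged as a prompt, the whole rewrite fits in a few lines. This is a reconstruction of the gist from the post, not the actual prompt; the exact wording is mine:

```python
# Reconstructed gist of the v5 narration instruction (exact wording is a guess).
NARRATION_PROMPT = """\
You are narrating a manhwa chapter from its panels.
Tell the story. Never describe clothing, hair, or decor.
Only emotion, action, consequence.

Bad:  "A woman in a blue gown enters a large room with chandeliers."
Good: "She enters the ballroom and the room goes quiet. Not the respectful kind."
"""
```

The bad/good pair does most of the work. One concrete example of the target register steers the output harder than a paragraph of rules.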

I sent v5 to myself for review. The storytelling was better — actually pulling you through scenes instead of listing them. But the narration still wasn't syncing to the panels. The voice would be describing one moment while the screen showed another. Close, but the rhythm was off.

Then I got reference videos — Korean web novel recap channels on YouTube, the exact format I'm trying to build. Couldn't pull transcripts (bot detection blocked it), but the style was clear from watching: second person, short sentences, emotional every line, never neutral.

• • •

Version 6

Second person narration. "YOU are the villainess. YOU walk into the ballroom. The eyes that follow you aren't curious — they're hungry."

Maximum eight words per sentence because the text-to-speech engine has a rhythm and short sentences hit harder. Emotional every line. Never a neutral statement. Every scene break describes what's on screen right now — not what happened, not what will happen, what the viewer is looking at in this moment.

606 words. 68 scenes. Five minutes and forty-two seconds. The narration pulls you through the story instead of walking beside it.
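A hard cap like eight words per sentence is also easy to lint before text hits the TTS engine. A minimal sketch; the sentence splitter is a crude heuristic of mine, not the pipeline's actual check:

```python
import re

def sentences_over_cap(narration, max_words=8):
    """Flag sentences that break the eight-word TTS rhythm rule.
    Splitting on ., !, ? is a rough heuristic, not a real tokenizer."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", narration) if s.strip()]
    return [s for s in sentences if len(s.split()) > max_words]
```

Run it over a draft and anything it returns gets rewritten shorter.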

That shift — from third person description to second person immersion — wasn't a technical fix. It was an editorial choice. The AI can generate either version equally well. The difference is which version makes you lean forward. "A woman enters the ballroom" is something you observe. "You enter the ballroom" is something you experience. Same story. Different nervous system.

• • •

Distribution is still broken

The pipeline produces now. Three versions in one day, each better than the last. Curated panels, AI-generated descriptions, story-driven narration, synthesized voice, word-aligned captions, cinematic zoom. The creative engine works. Distribution is still broken.

The video is 52 megabytes. Cloudflare Pages has a 25-megabyte limit. Cloudflare R2 needs permissions I don't have yet. YouTube's OAuth flow failed twice — redirect URI mismatch, then code exchange error. Classic. I split the video into two parts and sent it through Telegram, which compresses the quality. Not a solution. Just today's workaround while the real hosting gets sorted.
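The arithmetic behind the split is trivial but worth pinning down. The 25 MB Cloudflare Pages per-file cap is from the post; the 50 MB Telegram bot upload cap is my assumption:

```python
from math import ceil

def parts_needed(size_mb, cap_mb):
    """How many equal splits a file needs to fit under an upload cap."""
    return ceil(size_mb / cap_mb)

parts_needed(52, 25)  # Cloudflare Pages per-file cap -> 3 parts
parts_needed(52, 50)  # assumed Telegram bot upload cap -> 2 parts
```

Which is why Telegram got two parts, and Pages would have needed three even before the R2 permissions problem.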

The pipeline can produce. Getting that production in front of people is tomorrow's problem.

Meanwhile, in the background, the machine kept running. Twenty-nine cron jobs. Market briefings morning and evening. X content on schedule. DCA research overnight — 500 experiments, report waiting in the morning. The Aster dashboard updated in five minutes: 87 trades, $351 realized, three open positions. Forty-one days of infrastructure, humming along while I spent the day arguing with a narration engine about the difference between a caption and a story.

• • •

Forty-one days

The manhwa pipeline went from "AI inventing fiction from a title" to "AI telling a story in second person from curated panels" in forty-eight hours. Two days. Three versions. One principle that keeps showing up everywhere I build:

First make it accurate. Then make it alive.

Day 40 was accuracy — stop the AI from guessing, make it look at reality. Day 41 was life — stop the AI from describing, make it tell. The first step gives you truth. The second step gives you something people actually want to watch.

Same arc as the voice profile. Day 6 I captured the structure of my writing. Day 39 I captured the soul. Accuracy first, then life. It's the same progression every time, in every domain. The pattern holds.

Day 41 complete. 54 curated panels. Three versions. One that finally sounds like a story. You are the villainess. The room goes quiet. Not the respectful kind.

Day 41 of ∞ — @astergod Building in public. Learning in public.

Following along? @astergod on X · Telegram