Where I'm at
Yesterday I said the foundation comes first. Today I started building it — and immediately found out the foundation I thought I had was wrong.
The backtest was showing 87 liquidations across five assets over 30 days. Eighty-seven. My live bots don't liquidate 87 times in a month. Something was broken in the simulation.
The bug was the starting balance. I'd given each asset exactly twice the margin needed for a full position. That sounds conservative if you've never run a leveraged trading system. It's not. In reality, the wallet absorbs losses across multiple layers before the exchange forces a liquidation. My simulator was starting each asset with barely enough runway, then collapsing the moment any real volatility hit.
The backtest wasn't simulating my strategy. It was simulating a version of my strategy with no safety margin — which, come to think of it, is exactly what happened to the clients who got liquidated on Day 33. Same lesson. Simulator and reality, both telling me the same thing: not enough cushion for the worst case.
Fixed the starting balance to three times the required margin. Used the exchange's actual liquidation formula instead of the hardcoded threshold I'd been using. Eighty-seven liquidations became six. That's what the live bots actually experience. Now the simulator matches reality, and I can trust what it tells me.
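The fix can be sketched in a few lines. This is a simplified model, not the exchange's actual formula: it assumes isolated-margin long positions and uses a placeholder 0.5% maintenance margin rate, and the function names are mine, not the simulator's.

```python
# Sketch of the simulator fix, assuming isolated-margin long positions.
# The real liquidation formula and maintenance rate are exchange-specific;
# the 0.5% maintenance rate here is a placeholder assumption.

def liquidation_price(entry_price: float, leverage: float,
                      maintenance_rate: float = 0.005) -> float:
    """Approximate liquidation price for an isolated long position."""
    return entry_price * (1 - 1 / leverage + maintenance_rate)

def starting_balance(required_margin: float, cushion: float = 3.0) -> float:
    """Fund each asset with a multiple of required margin (was 2x, now 3x)."""
    return required_margin * cushion

# A long opened at 2,000 with 10x leverage liquidates near 1,810 under
# this approximation -- a price the formula produces, not a hardcoded level.
print(liquidation_price(2000, 10))
print(starting_balance(500))
```

The point of the formula over a hardcoded threshold is that the liquidation level moves with entry price and leverage, which is what the live exchange does.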
• • •
The loop that picks for me
Then I built something I didn't plan to build.
I'd been picking the bot parameters by feel. How many layers before the bot stops adding? What percentage drop triggers the next buy? What entry threshold off the 24-hour high? I'd been choosing these numbers based on experience, adjusting them when they felt wrong, and testing them by running live money through them.
That's how I've done it for thirty-five days. It's also how clients got liquidated. My intuition about gold was wrong. My sizing for silver was wrong. The numbers I picked by feel worked in calm markets and failed under stress.
So I stopped picking. I built a loop that picks for me. The idea is borrowed from machine learning research — Karpathy calls it autoresearch. You take your system, define a scoring function that measures what "good" looks like, and then let the machine test hundreds of parameter combinations automatically. Change one thing. Run the test. Keep what's better. Discard what's worse. Log everything. Repeat.
Ten experiments ran in ninety seconds. The score improved by 21%.
The biggest finding was the one I wouldn't have guessed: gold's maximum layers should be 5, not 10.
Gold was getting destroyed by volatility in this data window. Every time price dropped hard, the bot kept adding layers — deeper and deeper — and when it couldn't recover fast enough, it liquidated. The fix wasn't adjusting the entry timing or the step size. It was capping how deep the hole could get. Five layers, not ten.
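A rough back-of-envelope shows why capping layers caps the hole. Assuming equal-size layers and a fixed step between fills (both assumptions mine; the real bot's sizing may differ), both the depth of the last fill and the margin at risk scale with the layer count:

```python
# Rough illustration (assumed equal-size layers, fixed step spacing) of why
# capping max layers caps the worst case.

def worst_case(max_layers: int, step_pct: float, layer_margin: float):
    depth = max_layers * step_pct           # % below first entry at last fill
    committed = max_layers * layer_margin   # total margin at risk
    return depth, committed

print(worst_case(10, 2.0, 100))  # (20.0, 1000): ten layers, twice the exposure
print(worst_case(5, 2.0, 100))   # (10.0, 500): the capped version
```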
The machine found it in ninety seconds. I'd been running ten layers on gold for weeks and never questioned it because the number felt right.
That's the difference between intuition and data. Intuition says "ten layers should be fine, gold doesn't move that fast." Data says "gold moved that fast on March 13 and wiped out your clients." The machine doesn't have intuition. It has experiments. And experiments don't lie about what they find.
Tonight the machine runs 500 more experiments while I sleep. By morning I'll know what the optimal entry threshold and step percentage should be — the two parameters I haven't touched yet. If the first ten found the gold layer cap, what will five hundred find?
• • •
The symbol scanner
The same principle led somewhere I didn't expect.
I had the agent build a symbol scanner — a tool that scores every trading pair on the exchange against four criteria: daily price range, recovery rate after drops, trend direction, and simulated profit. Three hundred and sixteen pairs scored and ranked in five minutes.
ETH came out on top. Score: 8,593 out of 10,000. Two hundred and sixty-five million dollars in daily volume. Ninety-four percent recovery rate. Five-point-four percent average daily range. The obvious choice — once you do the math across 316 symbols. I wouldn't have done that math manually. I would have picked based on what I knew and what I'd heard.
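A plausible shape for that scoring, assuming four equally weighted criteria each normalized to the 0-1 range and summed into a score out of 10,000. The weights and normalization here are my guess at the structure, not the actual tool:

```python
# Sketch of a symbol score out of 10,000 from four normalized criteria.
# Equal weights and [0, 1] normalization are assumptions, not the real tool.

def score_symbol(daily_range: float, recovery_rate: float,
                 trend: float, sim_profit: float) -> int:
    """Each input normalized to [0, 1]; returns a score out of 10,000."""
    weights = [0.25, 0.25, 0.25, 0.25]
    parts = [daily_range, recovery_rate, trend, sim_profit]
    return round(10_000 * sum(w * p for w, p in zip(weights, parts)))

# Hypothetical ETH-like inputs: 5.4% daily range scaled against a 10% cap,
# 94% recovery rate, a mild uptrend, decent simulated profit.
print(score_symbol(0.54, 0.94, 0.8, 0.9))
```

Ranking 316 pairs is then one pass of this function over the exchange's symbol list, sorted by score.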
The machine picked based on what the data said. The ETH bot is running now. Waiting for a 1% pullback from the 24-hour high to enter Layer 1. I didn't decide to add ETH. The research did.
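The entry rule as described reduces to one comparison. Function and parameter names are illustrative, not the bot's actual API:

```python
# The ETH Layer 1 entry rule: open only after price pulls back at least
# pullback_pct% from the rolling 24-hour high. Names are illustrative.

def should_enter(price: float, high_24h: float, pullback_pct: float = 1.0) -> bool:
    """True once price has dropped at least pullback_pct% from the 24h high."""
    return price <= high_24h * (1 - pullback_pct / 100)

print(should_enter(1975.0, 2000.0))  # True: 1.25% below the 2,000 high
print(should_enter(1990.0, 2000.0))  # False: only 0.5% below
```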
That's a shift I keep noticing. The decisions I'm making aren't "which parameters should I use" or "which assets should I trade" anymore. They're "how should the machine decide these things." I'm not the optimizer. I'm the architect of the optimization. The system finds the answers. I design the system that finds them.
• • •
The correction tracker
One more thing that happened today, quieter but maybe more important long-term.
I've been correcting my AI agent on the same two things for thirty-five days. Check the chat history before responding — don't make me repeat context I already gave you. And don't ask me the same question twice in a session. Same corrections. Over and over.
Today I built a tracker that logs every correction I make and categorizes it. Thirty-five days of corrections, sorted and counted.
Forty-two percent of all corrections fall into two categories. Check history first. Don't repeat yourself. Forty-two percent.
Nearly half of all the corrections I've made trace back to the same two structural issues. That's not a rules problem. The rules exist — they're in the agent's instruction files, they've been there since Day 4. It's a structural problem in how the instructions are written. The agent reads them, acknowledges them, and then skips them under load.
Same pattern from Day 5, when the trigger phrase lasted exactly one day. Same pattern from Day 9, when rules in a file weren't the same as rules in a prompt. The tracker doesn't fix it. It makes the pattern visible. And visible is the first step toward fixable.
The proposals it generated aren't perfect yet — suggestions, not solutions. But the machine is pointing at the right problem. Which is more than I was doing by repeating the same correction for the 47th time and hoping it would stick.
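The tracker itself doesn't need to be clever. A sketch, assuming simple keyword-based categorization; the category names and keywords are illustrative, not the real taxonomy:

```python
# Sketch of a correction tracker: log corrections, bucket them by keyword,
# report category percentages. Categories and keywords are illustrative.
from collections import Counter

CATEGORIES = {
    "check_history_first": ["check the chat history", "already gave you",
                            "repeat context"],
    "no_repeat_questions": ["asked that already", "same question", "twice"],
}

def categorize(correction: str) -> str:
    text = correction.lower()
    for category, keywords in CATEGORIES.items():
        if any(k in text for k in keywords):
            return category
    return "other"

def report(corrections: list[str]) -> dict:
    """Percentage of corrections per category, rounded to whole percent."""
    counts = Counter(categorize(c) for c in corrections)
    total = len(corrections)
    return {cat: round(100 * n / total) for cat, n in counts.items()}

log = [
    "Check the chat history before responding.",
    "You asked that already this session.",
    "Use the new API key.",
]
print(report(log))
```

Even this crude bucketing is enough to surface a 42%-style concentration: the point isn't precision, it's making the repeat offenders visible.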
• • •
Day 35
Thirty-five days. Five weeks.
Day 33 the strategy broke because I was guessing at the right parameters. Day 34 I reset everything and committed to finding them properly. Day 35 I built the tools that find them — and in ninety seconds they showed me something I'd been wrong about for weeks.
The foundation isn't a number. It's a process for finding the number. The autoresearch loop. The symbol scanner. The correction tracker. Three systems that make the machine smarter without me manually adjusting anything.
Tomorrow I check what 500 experiments found overnight. The machine is working while I sleep. That's been the goal since Day 1 — except now it's not just trading while I sleep. It's learning while I sleep.
Day 35 complete. Eighty-seven simulated liquidations became six. Gold max layers: 5, not 10. ETH added by data, not gut. Five hundred experiments queued for tonight. The machine doesn't guess. That's the whole point.
Day 35 of ∞ — @astergod Building in public. Learning in public.