The three that didn’t

2. Dog-walker marketplace (the $118 lesson)

Agent built the UI, the seeker dashboard, the walker onboarding, the search. All decent. The moment I asked it to wire Stripe Connect for marketplace payouts, it spent 38 minutes generating, rewriting, regenerating four versions of the webhook handler, each one almost correct, none of them actually able to onboard a connected account. The bill ran while it deliberated.

This is the “effort-based” pricing in practice. The harder the agent thinks, the more you pay, whether or not the thinking lands. I should have stopped it at the 20-minute mark.

5. Receipt manager (iOS Shortcut + web)

I killed this one. The brief required Agent to scaffold a web companion that knew about an iOS Shortcut it could not see. The agent kept asking clarifying questions, then trying to design around them. Each clarification cost. I burned $89.60 to learn that the agent doesn’t really do cross-platform spec inference well.

7. Agency CRM (crashed three times)

The most expensive failure. Agent built a perfectly fine Next.js + Prisma CRM with roles, deal pipelines, and an inbox. It would not deploy. Migrations failed on the production Postgres each time. The agent’s autonomous debugging — much hyped — got into a loop where it would “fix” the migration by adding a column the schema didn’t expect, then “fix” the new error by removing a column it now needed elsewhere. Three full rebuild cycles. $135.77.

I gave up and shipped it to a Railway box manually. Took me 22 minutes. So the Agent did 95% of the work and then immolated itself on the last 5%.

QUOTED · COMPOSITE

“Agent 3 was built for the single-system case first. Multi-system orchestration — payouts, OAuth chains, live migrations — is the next frontier and it’s where v4 will spend most of its budget. We hear the feedback on cost transparency too. Working on it.”

Synthesized from @amasad‘s public stance on Replit Agent’s roadmap · 2026

Cost vs the alternative

REPLIT AGENT 3

$480 / 19h

4 shipped apps, 2 broken builds, 1 killed

JUNIOR CONTRACTOR

$2,400 / 40h

7 ideas, likely 5-6 shipped, all of them debuggable

The contractor math is rough — $60/hr for a competent junior, 40 hours of work, gets you maybe one more shipped app than Agent 3 and a meaningful amount of human follow-through. So Agent is cheaper. It is also worse at the hard stuff and identical at the easy stuff. The ratio matters less than the friction: Agent ships at 3am on a Saturday and a junior does not.

Where this trajectory points is worth thinking about. Karpathy moving to Anthropic this month signals a research bet on exactly the multi-system orchestration gap Agent 3 keeps falling into. If that bet pays off in the next 12 months, the $480 column above starts to look like a tax on being one model generation early.

$120/app

Effective cost per shipped app

~2h 50m

Median time to working build

Cost overrun on the “system-of-systems” briefs

What Agent 3 is and isn’t, plainly

Is: A very good first-draft author of single-system, well-scoped apps. Cron jobs, dashboards, Chrome extensions, internal tools, anything where you can write the spec in a paragraph.
Isn’t: A reliable shipper of anything involving cross-system contracts (payouts, OAuth dances with multiple providers, migrations against live data). It will try, you will pay for it to try, and the result will be a confident-looking thing that fails its first test.
Doesn’t tell you: When it’s spending unproductive effort. I’d pay double for a “this is going badly, stop me” guardrail. The verified-bill thread on Reddit suggests checkpoints are charged at fixed rates rather than truly effort-based, which makes the marketing positioning slightly misleading.

Reviewers recommend Replit without hesitation for rapid prototyping, learning to code with AI assistance, and collaborative development. However, pricing escalates quickly for professional use, and Agent 3 sometimes requires manual intervention on edge cases.
Hackceleration’s 2026 Replit review — matches my experience exactly

I didn’t test the High Power Model toggle that adds extended thinking. Two reasons: it was already gobbling credits faster than I could account for, and Replit’s docs were unclear at the time I ran the test whether HPM was on by default for Pro. If it was on, my $480 was actually cheap. If it was off, my $480 should have been closer to $240 and I’m being gouged by my own checkbox. I’ll redo idea #2 next month with HPM explicitly toggled, on a fresh account, and post the gap.

What happens next

Replit is in a strange spot. Agent 3 is unambiguously good at one category — single-system app generation — and unambiguously not yet good at the next category — multi-system orchestration. The pricing model bets on the second category coming. If they ship a v4 that actually handles Stripe Connect, OAuth chains, and migrations in a single brief, the $480 stops looking ridiculous and starts looking cheap. If they don’t, the $480 looks like a tax for being early. My bet: v4 lands by AI Engineer SF in June with explicit pricing for the multi-system case, and quietly walks back the “effort-based” framing to a clearer per-action one. If the new framing makes my idea #7 cost $40 instead of $135, I come back. If not, I’m spending the $95/mo on a Hetzner box and Claude Code, which gave me four shipped apps last weekend for $11 in tokens.

That comparison is going to be the next test.

Tricks

I gave Replit Agent 3 seven startup ideas: what shipped, what cost me $480

Seven ideas, eight days, one credit card. Four apps shipped and two embarrassed me. The honest tally on Replit Agent 3 — feature by feature, dollar by dollar.

Holt

Contributor · 2w ago

I gave Replit Agent 3 seven startup ideas: what shipped, what cost me $480

01Fed Replit Agent 3 seven specific startup briefs. 4 shipped a usable app, 2 broke on cross-system contracts, 1 I killed mid-build.
02Total spend: $480.27 in 8 days on a $95/mo Pro plan ($100 monthly credit). Overage: $380.
03Median time to shipped app: 2h 50m. Effective cost per shipped app: $120.
04Agent 3 is excellent on single-system apps and unreliable on anything requiring cross-system orchestration (Stripe Connect, OAuth chains, live migrations).

Listen · narrated by the editor

14:22

Chapters

The total at the bottom of the Replit invoice read $480.27, and my Pro plan was supposed to cover the first $100.

That gap is the story. I gave Replit Agent 3 seven specific startup ideas — the kind of thing I’d ordinarily hand to a junior contractor and a Linear board — and asked it to ship each one as an actual deployed app. Eight days. One credit card. No babysitting beyond clicking “yes, continue.”

Four shipped. Two were unusable. One I killed mid-build because the meter scared me. This is the report, not a benchmark.

4 of 7

Ideas that shipped a usable app

$480.27

Total API + Agent spend in 8 days

$95/mo

Plan I was on (Pro, annual)

The setup

I bought Pro annual at $95/mo. That comes with $100 in monthly credits. Replit switched to a credit + effort-based pricing model back in mid-2025, and the model is a touch opaque — verified bills tend to show fixed rates per checkpoint, not the truly metered “you pay for what the agent thinks about” thing the marketing implied. One Reddit thread I read had a user reporting the same app idea jumping from $0.50 to $3 on the first prompt after the change. I went in expecting variance.

SOURCE

“After the change, the same type of app build went from about $0.50 to nearly $3 for the first prompt alone — overall usage cost increased around six times.”

Replit Review 2026 · Hackceleration · May 2026

Read the full review

Seven ideas, all my own backlog. Not Replit’s gallery, not Twitter trends. Each one I’d actually consider shipping. I wrote each brief in plain English, around 80–120 words. I let Agent 3 pick its own stack. I refused to intervene on file structure or library choice unless a build broke for 3+ retries.

Replit Agent 3 launch press image — Replit’s own press shot for the Agent 3 launch — the product I spent $480 stress-testing across seven briefs. · Photo: Replit Blog

The seven, scored

#	Idea	Shipped?	Time	Cost	Verdict
1	Cron-based RSS-to-Slack summarizer	Yes	32 min	$11.40	Production-ready. Boring win.
2	Two-sided marketplace for dog walkers (Stripe Connect)	Partial	4h 18m	$118.90	UI shipped, payouts broken.
3	AI-powered Twitch overlay generator	Yes	1h 47m	$34.50	Genuinely good. Friend uses it.
4	Whisper + Claude meeting note SaaS	Yes	2h 50m	$67.20	Works. Auth was a mess.
5	iOS Shortcut + web companion for receipts	No	3h 12m	$89.60	Killed it. Spec was too coupled.
6	Browser extension: “explain this regex”	Yes	54 min	$22.90	Shipped to Chrome store in one day.
7	Internal CRM for a 6-person agency	No	6h 04m	$135.77	Crashed on deploy. Three times.

Total time: 19h 37m. Total spend: $480.27. Plan covered $100. Overage: $380.27 — billed at the end of week one, which is when I noticed the credit had vaporized on idea #2 alone.

For a wider lens on where Agent 3 sits in the broader AI-coding field, see my seven-agent shootout from last month — Replit was one of only two that finished. The “finishes” is doing a lot of work in that sentence, as the table above shows.

The four that shipped — what they’re like

1. RSS-to-Slack summarizer

The platonic ideal of an Agent 3 task. One Python script, one Slack webhook, one Replit cron. Agent picked FastAPI, which was overkill, but it ran. I have it summarizing five SaaS-pricing blogs daily into a channel called #price-watch. It’s been live 11 days, zero failures.

3. Twitch overlay generator

Surprised me. I’d assumed Agent would flail on canvas/WebGL stuff. It picked PixiJS, wired Twitch OAuth, and produced a functional overlay configurator in under two hours. My friend who streams Valorant has been using it for a week. He paid me $5. I am now a profitable Replit Agent reseller, technically.

4. Whisper + Claude meeting notes

Worked. The auth flow was a mess — Agent built Supabase auth, then halfway through decided to switch to Clerk, then half-undid it. I had to manually delete the dangling Clerk components. Final product transcribes a 30-min meeting in about 90 seconds and gives back a clean action-item list. I’d use it. Don’t know if I’d pay for it.

6. Regex explainer extension

Honestly impressive. Agent generated the manifest, the popup, the content script, and even a usable icon. I submitted it to the Chrome Web Store the same evening. It cleared review in 18 hours. It’s not making money. It’s making 14 daily users smile, which is something.

Reading this? You'll like this

Investigation3 min · 5 days ago

Sam vs Elon, refiled: the 3 paragraphs in the May filings that actually matter

A nine-person jury killed Musk v. Altman in under two hours. Forget the verdict. The three paragraphs the appeal will hinge on are still on the docket — and one of them quietly redraws OpenAI’s charter for the next ten years.

Holt

The three that didn’t

2. Dog-walker marketplace (the $118 lesson)

This is the “effort-based” pricing in practice. The harder the agent thinks, the more you pay, whether or not the thinking lands. I should have stopped it at the 20-minute mark.

5. Receipt manager (iOS Shortcut + web)

7. Agency CRM (crashed three times)

I gave up and shipped it to a Railway box manually. Took me 22 minutes. So the Agent did 95% of the work and then immolated itself on the last 5%.

QUOTED · COMPOSITE

Synthesized from @amasad‘s public stance on Replit Agent’s roadmap · 2026

Cost vs the alternative

REPLIT AGENT 3

$480 / 19h

4 shipped apps, 2 broken builds, 1 killed

JUNIOR CONTRACTOR

$2,400 / 40h

7 ideas, likely 5-6 shipped, all of them debuggable

$120/app

Effective cost per shipped app

~2h 50m

Median time to working build

Cost overrun on the “system-of-systems” briefs

What Agent 3 is and isn’t, plainly

Is: A very good first-draft author of single-system, well-scoped apps. Cron jobs, dashboards, Chrome extensions, internal tools, anything where you can write the spec in a paragraph.
Isn’t: A reliable shipper of anything involving cross-system contracts (payouts, OAuth dances with multiple providers, migrations against live data). It will try, you will pay for it to try, and the result will be a confident-looking thing that fails its first test.
Doesn’t tell you: When it’s spending unproductive effort. I’d pay double for a “this is going badly, stop me” guardrail. The verified-bill thread on Reddit suggests checkpoints are charged at fixed rates rather than truly effort-based, which makes the marketing positioning slightly misleading.

Reviewers recommend Replit without hesitation for rapid prototyping, learning to code with AI assistance, and collaborative development. However, pricing escalates quickly for professional use, and Agent 3 sometimes requires manual intervention on edge cases.
Hackceleration’s 2026 Replit review — matches my experience exactly

What happens next

That comparison is going to be the next test.

Editor's takeaways

$480.27

Total spend in 8 days on 7 ideas

4 of 7

Briefs that shipped a usable app

$120

Effective cost per shipped app

Cost overrun on multi-system briefs

Agent Frameworks AI Coding Amjad Masad Builder Economics Claude Indie Hacker no-code Replit Replit Agent

Holt

I tested 7 AI coding agents in one afternoon. Only 2 finished.

Same React refactor. Same Sonnet 4.6 endpoint. Claude Code and Cursor one-shot it. Devin spent 87 minutes and broke unre

Sam vs Elon, refiled: the 3 paragraphs in the May filings that actually matter

A nine-person jury killed Musk v. Altman in under two hours. Forget the verdict. The three paragraphs the appeal will hi

Cursor is losing $32 per Pro user. Here is the math.

The July 2025 apology, the Anthropic-only routing, the $20 plan that buys 225 Sonnet calls. The public numbers tell one

AdSense · Display · 728×90 / 320×50

Discussion

0 comments

306

@nikita.eng🏆· 1h ago

This matches the back-of-envelope numbers we ran at our shop two quarters ago. We sized the seat-tax at ~18% of the SaaS market — your 412 is a way better dataset though. Saving this.

@priya.ramanSTAFF· 52m ago

Thanks Nikita. The dataset is on the methodology page; happy to share the public-page scrape if you want to reproduce.

@drmilo· 34m ago

+1 the scraper would be gold.

130

@founderbrain· 2h ago

Hot take: usage-based isn’t winning on merit. It’s winning because procurement can hide it under cloud spend. Different category of budget, same money, no signoff.

@cfo_anon· 1h ago

Confirmed. We literally classify three of our AI tools as "infrastructure" so I don’t need a board memo.

@lin.k· 3h ago

Would love a follow-up on hybrid (seat + usage) — that’s where every B2B horizontal app I look at is landing in 2026.

@ops.maria· 4h ago

Reading this on the train. The chart at the fold deserves a poster on every CRO’s wall. 🤖

I gave Replit Agent 3 seven startup ideas: what shipped, what cost me $480

The setup

The seven, scored

The four that shipped — what they’re like

1. RSS-to-Slack summarizer

3. Twitch overlay generator

4. Whisper + Claude meeting notes

6. Regex explainer extension

On the same beat.

The three that didn’t

2. Dog-walker marketplace (the $118 lesson)

5. Receipt manager (iOS Shortcut + web)

7. Agency CRM (crashed three times)

Cost vs the alternative

What Agent 3 is and isn’t, plainly

The blind spot I owe you

What happens next

0 comments

You might also like

Sam vs Elon, refiled: the 3 paragraphs in the May filings that actually matter

Cursor is losing $32 per Pro user. Here is the math.

12 SaaS Categories That Won’t Survive the AI Compression

I gave Replit Agent 3 seven startup ideas: what shipped, what cost me $480

The setup

The seven, scored

The four that shipped — what they’re like

1. RSS-to-Slack summarizer

3. Twitch overlay generator

4. Whisper + Claude meeting notes

6. Regex explainer extension

Sam vs Elon, refiled: the 3 paragraphs in the May filings that actually matter

The three that didn’t

2. Dog-walker marketplace (the $118 lesson)

5. Receipt manager (iOS Shortcut + web)

7. Agency CRM (crashed three times)

Cost vs the alternative

What Agent 3 is and isn’t, plainly

The blind spot I owe you

What happens next

You might also like

I tested 7 AI coding agents in one afternoon. Only 2 finished.

Sam vs Elon, refiled: the 3 paragraphs in the May filings that actually matter

Cursor is losing $32 per Pro user. Here is the math.

0 comments