Top 10 AI Coding Tools

AI coding tools are evolving fast. From Cursor's economics to Replit's agent, we test which ones actually ship. This category tracks the tools that are reshaping how we write code.

By Holt · synthesized from 3 sources

Cursor is bleeding $32 per Pro user every month, and the math is not a secret — it’s sitting in plain sight on Anthropic’s pricing page and in Cursor’s own apology post from July 2025. That’s the number that defines this moment in AI coding tools: the unit economics of selling “unlimited Sonnet” for $20 simply don’t work when a single Sonnet turn costs eleven cents and power users run 200–600 turns a day. Meanwhile, I gave seven AI coding agents the same refactoring task and only two shipped working code on the first try — Claude Code and Cursor (forced Sonnet) passed, while the rest failed or needed human fixes. The trend isn’t that these tools are getting better; it’s that the gap between what they promise and what they deliver is being measured in real dollars and real afternoons.
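The arithmetic behind those figures can be checked in a few lines. What follows is a rough sketch, not Cursor's actual accounting: it assumes inference is the only variable cost and that every turn costs the same eleven cents. The function names are made up for illustration; the dollar figures are the ones quoted above.

```python
# Figures quoted in the article; everything else is derived from them.
PRO_PRICE_USD = 20.00               # Cursor Pro monthly price
COST_PER_TURN_USD = 0.11            # cost of a single Sonnet turn
LOSS_PER_USER_USD = 32.00           # reported monthly loss per Pro user
POWER_USER_TURNS_PER_DAY = (200, 600)


def break_even_turns(price: float, cost_per_turn: float) -> float:
    """Turns per month at which inference cost equals the subscription price."""
    return price / cost_per_turn


def implied_turns(price: float, loss: float, cost_per_turn: float) -> float:
    """Monthly turns implied by a given loss, ASSUMING inference is the only cost."""
    return (price + loss) / cost_per_turn


def daily_cost(turns_per_day: int, cost_per_turn: float) -> float:
    """Inference cost of one day of usage at a flat per-turn price."""
    return turns_per_day * cost_per_turn


print(f"Break-even: {break_even_turns(PRO_PRICE_USD, COST_PER_TURN_USD):.0f} turns/month")
print(f"Implied average: {implied_turns(PRO_PRICE_USD, LOSS_PER_USER_USD, COST_PER_TURN_USD):.0f} turns/month")
for turns in POWER_USER_TURNS_PER_DAY:
    print(f"{turns} turns/day costs ${daily_cost(turns, COST_PER_TURN_USD):.2f} per day")
```

Under these assumptions a $20 plan breaks even at roughly 182 turns a month, and a $32 loss implies an average of about 473 turns a month. A power user at 200 turns a day costs more than the monthly price in a single day, which suggests the $32 figure is an average across light and heavy users rather than the worst case.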

The through-line across these three investigations is that inference cost is the hidden governor of every AI coding product, and nobody has solved it. The Cursor piece, “Cursor is losing $32 per Pro user. Here is the math.”, shows an inference reseller masquerading as a SaaS business: its gross margin is determined by whatever Anthropic publishes next quarter, not by its own engineering. The Replit piece, “I gave Replit Agent 3 seven startup ideas: what shipped, what cost me $480”, tells the same story from the user side: a $95 Pro plan’s $100 in monthly credits evaporated on one idea (a dog-walker marketplace that shipped broken payouts), and the final bill hit $480. Even the agents that finished in “I tested 7 AI coding agents in one afternoon. Only 2 finished.” (Claude Code at $2.31 and Cursor at roughly $11 for the full test) were only viable because I brought my own model endpoint. The moment you rely on the tool’s built-in model routing, you’re playing a game where the house always wins.

Where these articles agree is that the “agent” label is doing heroic work. Cursor’s Auto router picks the cheapest model that can still finish the task, which is how Cursor markets “unlimited” while actually shipping GPT-4o-mini-tier responses for most requests. Replit’s Agent 3 shipped four of seven ideas, but two of those four, the Whisper + Claude meeting notes app and the Twitch overlay generator, came with broken auth flows and dangling dependencies that a human had to clean up. The agent test’s only clean pass was Claude Code, which is essentially Anthropic’s own CLI wrapper, not a third-party product. The disagreement is subtle: Cursor and Replit are betting that users will tolerate noise and overage charges for speed, while Claude Code and Codex CLI (which got a “Partial” for surfacing an untested code path) are betting that users want correctness and honesty first. The market hasn’t picked a winner yet, but the numbers say speed without margin is a Ponzi scheme.

The contrarian angle is that these articles are too kind to the tools that “finished.” Cursor’s agent mode passed my refactoring test, but it renamed a prop without asking and pulled in twelve files where four were needed — that’s not a win, that’s a chatty intern who wastes review time. Replit’s four “shipped” apps included one that required manual deletion of dangling Clerk components and another where the auth flow was so tangled it took longer to untangle than building from scratch. The real missing insight is that AI coding tools are excellent at generating the illusion of progress — they produce a diff that compiles, a UI that renders, a bill that surprises. But the cost of verifying and fixing their output is rarely counted in the marketing. Cursor’s apology post admitted the $20 plan was always a tripwire, but nobody is talking about the tripwire for the user: the hours spent diagnosing a bad diff, the $380 overage on a single weekend, the four agents that shipped code that broke tests in ways that took longer to fix than rewriting.

If you only read one, make it “Cursor is losing $32 per Pro user. Here is the math.”, because it exposes the foundational lie of the entire category: these aren’t SaaS products, they’re inference resellers pretending to have margins. The agent test and the Replit experiment are symptoms of the same disease: when your COGS is dictated by a third-party rate card, every feature is a loss leader, and every “pro” user is a liability. Until a tool builds its own model or negotiates a deal that makes the unit economics work, the smart money is on bringing your own API key and treating every agent like a contractor you pay per task, not a subscription you pray will pay for itself.
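To make the pay-per-task framing concrete, here is a small sketch that counts how many runs of a task fit inside a flat monthly budget when you pay for inference yourself. It assumes each run costs about what the refactoring test cost with my own endpoint, which will not hold for larger or smaller tasks. The helper name `tasks_within_budget` is hypothetical.

```python
import math

SUBSCRIPTION_USD = 20.00  # flat monthly plan price cited in the article

# Per-run costs reported for the refactoring test (own model endpoint).
TASK_COST_USD = {
    "Claude Code": 2.31,
    "Cursor (forced Sonnet)": 11.00,  # "roughly $11" in the article
}


def tasks_within_budget(budget: float, cost_per_task: float) -> int:
    """Number of whole tasks whose pay-per-task total stays within the budget."""
    if cost_per_task <= 0:
        raise ValueError("cost_per_task must be positive")
    return math.floor(budget / cost_per_task)


for tool, cost in TASK_COST_USD.items():
    count = tasks_within_budget(SUBSCRIPTION_USD, cost)
    print(f"{tool}: {count} task(s) of this size for ${SUBSCRIPTION_USD:.2f}")
```

At those per-run costs, $20 covers eight Claude Code runs or one Cursor run of the same task. The point of the comparison is not that one option is cheaper, but that paying per task makes the cost of each piece of work visible before the bill arrives.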

