Indie hackers get crushed by surprise AI bills. Month one looks fine. By month three you have a few hundred users, your OpenAI dashboard hits $400, and you realize your $9/month subscription does not cover it. Here is how to size the bill before you ship — and which models actually let an indie project break even.
For most ChatGPT wrappers built by solo devs, the cost per active user breaks down like this:
| User type | Messages/day | Monthly token cost (GPT-4o mini) | Monthly token cost (GPT-4o) |
|---|---|---|---|
| Trial / free | 3 | $0.04 | $0.66 |
| Light paid | 10 | $0.13 | $2.20 |
| Average paid | 25 | $0.32 | $5.50 |
| Heavy power user | 75 | $0.97 | $16.50 |
| Whale (do not let this happen) | 300 | $3.87 | $66.00 |
Assumes 1,000 input tokens (with chat history) and 300 output tokens per message.
The "whale" row is what kills indie projects. One user who chats nonstop on GPT-4o can cost more than 10 paying customers' subscription revenue. Always rate-limit. Always.
Pretend you launch with 50 active users in month one. Half are on free tier (5 messages/day max), half are paying $9/mo:
Cost on each model:
| Model | Monthly token cost | Revenue ($9 × 25 paid) | Margin |
|---|---|---|---|
| Gemini 2.0 Flash | $4.95 | $225 | 98% |
| GPT-4o mini | $7.43 | $225 | 97% |
| Claude Haiku 3.5 | $40.50 | $225 | 82% |
| Gemini 2.5 Pro | $95.63 | $225 | 58% |
| GPT-4o | $120.00 | $225 | 47% |
| Claude Sonnet 4 | $225.00 | $225 | 0% — break even |
| Claude Opus 4 | $1,125.00 | $225 | LOSS — 5x revenue |
For an indie MVP, the cheap tier is the only sane choice. Anything above Claude Haiku eats your margin. Anything above Gemini 2.5 Pro turns a $9 subscription into a money loser at scale.
Get the exact margin for your usage shape and price point.
Open AI Cost Calculator →1. Cache the system prompt. Both Anthropic and OpenAI offer prompt caching. If your system prompt is 500 tokens and never changes, caching cuts that cost by 50-90% per request after the first.
2. Truncate chat history. Most conversations only need the last 5-10 turns. Drop everything else. Or summarize old turns into a single 100-token summary.
3. Set max_tokens aggressively. If you do not need more than 300 output tokens, set the cap. Cheap models will sometimes write essays if you let them.
4. Use the cheapest model that works. Test GPT-4o mini first. Only upgrade specific prompts (the ones quality fails on) to a more expensive model. Most prompts do not need premium.
5. Rate-limit hard. Free tier: 5-10 messages/day. Paid tier: 50-100/day soft limit, with a "fair use" hard cap at 200/day. Communicate this clearly. Your whales are your enemies.
6. Use batch APIs for non-real-time work. If you have any background processing (summarization, classification, embeddings), submit as a batch job for a 50% discount.
A. Free with usage cap, paid for unlimited. Free tier covers 80% of users, paid tier covers the heavy users. Common ratio: free at 5 msg/day, paid at $9-19/mo for 100-500 msg/day. Works because heavy users self-select into paying.
B. Bring your own key (BYOK). Charge a fixed subscription ($5-15/mo) for the UI and features, but users supply their own API key. Eliminates token cost from your P&L entirely. Trade-off: signup friction is higher.
C. Credit-based. Sell packs of credits ($10 = 200 messages). Direct pass-through with markup. Simpler economics but worse retention than subscriptions.
Before you push to production, run these numbers through the AI Cost Calculator:
If the worst-case "whale" is more than 3x your subscription price, your rate limits are too loose. Tighten them or switch to a cheaper model.
Project your MVP bill at 100, 1,000, and 10,000 users — across every model.
Open AI Cost Calculator →Start on Gemini 2.0 Flash or GPT-4o mini. Cap free tier at 5 messages/day. Cache system prompts. Truncate chat history beyond 10 turns. Charge $9-19/month. Don't let any user hit more than 200 messages per day.
Run the numbers in the calculator. If the worst case still profits, ship.