ChatGPT has token limits, but they're hidden behind the chat interface and muddied by the differences between the free and Plus tiers. Here is exactly what the limits are, why ChatGPT sometimes forgets earlier messages, and how to work within them.
| Tier | Model | Context window | Approximate words |
|---|---|---|---|
| Free | GPT-4o mini | ~8K-32K | ~6,000-24,000 |
| Free | GPT-4o (limited) | ~32K-128K | ~24,000-96,000 |
| Plus | GPT-4o | 128K | ~96,000 |
| Plus | GPT-4.1 | 128K | ~96,000 |
| Plus | o3 / o4-mini | 128K | ~96,000 |
| Pro | GPT-4o, o3 | 128K | ~96,000 |
| API direct | GPT-4o | 128K | ~96,000 |
OpenAI changes the limits for free users periodically based on cost and capacity. As of April 2026, free ChatGPT users get smaller context windows than Plus users on the same model — even though the underlying model technically supports more.
The context window is the total number of tokens ChatGPT can "see" at once. It includes:
- Your messages
- ChatGPT's responses
- Custom instructions and the system prompt
- The content of any files you upload
Once the total exceeds the window, ChatGPT starts dropping the oldest content. You'll notice this when ChatGPT suddenly "forgets" something you said earlier in a long conversation.
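This oldest-first trimming can be sketched in a few lines. The sketch below is illustrative, not ChatGPT's actual implementation, and it uses a rough ~1.3 tokens-per-word estimate rather than a real tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~1.3 tokens per English word."""
    return int(len(text.split()) * 1.3)

def trim_to_window(messages: list[str], window: int) -> list[str]:
    """Drop the oldest messages until the total fits inside the window."""
    kept = list(messages)
    while kept and sum(estimate_tokens(m) for m in kept) > window:
        kept.pop(0)  # the oldest message is "forgotten" first
    return kept

history = ["first " * 50, "middle " * 50, "latest " * 50]
# Each message is ~65 estimated tokens; a 150-token window
# only has room for the last two, so the first is dropped.
print(trim_to_window(history, window=150))
```

The key property is that trimming is silent: nothing tells you a message was dropped, which is why the forgetting feels sudden.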
Signs ChatGPT is dropping context:
- It asks for information you already gave it
- It contradicts instructions from earlier in the chat
- It forgets names, constraints, or decisions established near the start of the conversation
If you see any of these, you've exceeded the effective context window. Time to start a new chat or summarize the important parts yourself.
For ChatGPT Plus users with the 128K context:
| Content | Approximate tokens | Fits? |
|---|---|---|
| Short conversation (10 messages) | 3,000 | Yes (4% of window) |
| Long conversation (100 messages) | 30,000 | Yes (23% of window) |
| Very long chat (500 messages) | 150,000 | No, will start dropping |
| Single 1-page document | 350 | Yes |
| 10-page report | 3,500 | Yes |
| 100-page report | 35,000 | Yes (27% of window) |
| 500-page document | 175,000 | No, exceeds window |
| Full novel (80K words) | 105,000 | Yes (82% of window, tight) |
For most users, 128K is more than enough. The main case where it's a constraint is very long conversations spanning hundreds of messages, or uploading very large documents.
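The table's percentages follow from a simple rule of thumb: roughly 1.3 tokens per English word. A small helper (the names are ours) reproduces the arithmetic:

```python
WINDOW = 128_000  # ChatGPT Plus advertised context window, in tokens

def words_to_tokens(words: int) -> int:
    """Rough heuristic: ~1.3 tokens per English word."""
    return int(words * 1.3)

def window_usage(words: int, window: int = WINDOW) -> float:
    """Fraction of the context window a text of `words` words occupies."""
    return words_to_tokens(words) / window

# An 80K-word novel lands near 80% of the 128K window -- tight,
# as the table notes. Real tokenizers will vary a few percent.
print(f"{window_usage(80_000):.0%}")
```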
1. Start a new chat with a summary. Ask ChatGPT to summarize the current conversation. Copy the summary. Start a new chat and paste the summary as context.
2. Upload key info as a file instead of pasting. ChatGPT can search uploaded files and pull in only the relevant passages, so a large file often consumes less of the window than pasting the full text.
3. Use a custom GPT with persistent instructions. Custom GPT instructions are included with every request, so they aren't dropped the way old conversation history is.
4. Switch to API access. The OpenAI API gives you the full 128K context and lets you manage history precisely. More work but more control.
5. Use a different model with a larger window. Claude (200K) or Gemini (1M+) handle longer documents better than ChatGPT.
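Workaround 4, managing history yourself, can look like the sketch below. The message shape mirrors the common chat-completions format, but the policy (pin the system prompt, drop the oldest turns) is our own illustrative choice, and the token count is the same rough heuristic as above:

```python
def estimate_tokens(text: str) -> int:
    return int(len(text.split()) * 1.3)  # rough heuristic, not a real tokenizer

def build_messages(system: str, turns: list[dict], budget: int) -> list[dict]:
    """Pin the system prompt; drop the oldest turns until the rest fits."""
    kept = list(turns)

    def cost(msgs: list[dict]) -> int:
        return estimate_tokens(system) + sum(
            estimate_tokens(m["content"]) for m in msgs
        )

    while kept and cost(kept) > budget:
        kept.pop(0)  # oldest turn goes first; the system prompt never does
    return [{"role": "system", "content": system}] + kept

turns = [
    {"role": "user", "content": "question one " * 30},
    {"role": "assistant", "content": "answer one " * 30},
    {"role": "user", "content": "question two " * 30},
]
# With a tiny 200-token budget, only the system prompt
# and the two most recent turns survive.
print(build_messages("You are a careful editor.", turns, budget=200))
```

Unlike the chat interface, this makes the trimming explicit: you decide what gets dropped, and you could swap in summarization instead of deletion.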
If you're pasting a long document into ChatGPT and want to know whether it will fit:
- Count the words (any word processor will show this)
- Multiply by roughly 1.3 to estimate tokens
- Compare the estimate against the 128K window, leaving headroom for the rest of the conversation and the reply
For very long documents, consider a model with a larger window (Claude or Gemini) instead of fighting with ChatGPT's limits.
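That check is easy to run locally before pasting. The ~1.3 factor is an English-prose approximation (code, tables, and other languages tokenize differently), and the 20% headroom default is our own choice:

```python
import pathlib

def will_fit(path: str, window: int = 128_000, headroom: float = 0.8) -> bool:
    """Estimate whether a text file fits, leaving room for the reply."""
    words = len(pathlib.Path(path).read_text(encoding="utf-8").split())
    return int(words * 1.3) <= window * headroom

# will_fit("report.txt") -> True if the estimate is under ~102K tokens
```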
The OpenAI API gives you the full advertised context window (128K for GPT-4o). ChatGPT's free tier sometimes uses a smaller effective window because of cost management. If you need maximum context, use the API directly — it's cheaper per token at typical use, and you control the full window.
For $20/month, ChatGPT Plus gives:
- The full 128K context window on GPT-4o and newer models
- Higher message limits than the free tier
- Access to additional models (GPT-4.1, o3 / o4-mini)
For users who run into token limits regularly, Plus is worth it. For casual use, the free tier is fine for most conversations.