ChatGPT has token limits, but they're hidden behind the chat interface and muddied by the differences between the free and Plus tiers. Here is exactly what the limits are, why ChatGPT sometimes forgets earlier messages, and how to work within them.
| Tier | Model | Context window | Approximate words |
|---|---|---|---|
| Free | GPT-4o mini | ~8K-32K | ~6,000-24,000 |
| Free | GPT-4o (limited) | ~32K-128K | ~24,000-96,000 |
| Plus | GPT-4o | 128K | ~96,000 |
| Plus | GPT-4.1 | 128K | ~96,000 |
| Plus | o3 / o4-mini | 128K | ~96,000 |
| Pro | GPT-4o, o3 | 128K | ~96,000 |
| API direct | GPT-4o | 128K | ~96,000 |
OpenAI changes the limits for free users periodically based on cost and capacity. As of April 2026, free ChatGPT users get smaller context windows than Plus users on the same model — even though the underlying model technically supports more.
The context window is the total number of tokens ChatGPT can "see" at once. It includes:
- Your messages
- ChatGPT's responses
- Custom instructions and the system prompt
- The content of any files you upload
Once the total exceeds the window, ChatGPT starts dropping the oldest content. You'll notice this when ChatGPT suddenly "forgets" something you said earlier in a long conversation.
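This oldest-first trimming can be sketched in a few lines. The sketch below is illustrative, not ChatGPT's actual implementation, and it uses a rough ~1.3 tokens-per-word estimate rather than a real tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~1.3 tokens per English word."""
    return int(len(text.split()) * 1.3)

def trim_to_window(messages: list[str], window: int) -> list[str]:
    """Drop the oldest messages until the total fits inside the window."""
    kept = list(messages)
    while kept and sum(estimate_tokens(m) for m in kept) > window:
        kept.pop(0)  # the oldest message is "forgotten" first
    return kept

history = ["first " * 50, "middle " * 50, "latest " * 50]
# Each message is ~65 estimated tokens; a 150-token window
# only has room for the last two, so the first is dropped.
print(trim_to_window(history, window=150))
```

The key property is that trimming is silent: nothing tells you a message was dropped, which is why the forgetting feels sudden.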
Signs ChatGPT is dropping context:
- It asks for information you already gave it
- It contradicts instructions from earlier in the chat
- It forgets names, constraints, or decisions established near the start of the conversation
If you see any of these, you've exceeded the effective context window. Time to start a new chat or summarize the important parts yourself.
For ChatGPT Plus users with the 128K context:
| Content | Approximate tokens | Fits? |
|---|---|---|
| Short conversation (10 messages) | 3,000 | Yes (4% of window) |
| Long conversation (100 messages) | 30,000 | Yes (23% of window) |
| Very long chat (500 messages) | 150,000 | No, will start dropping |
| Single 1-page document | 350 | Yes |
| 10-page report | 3,500 | Yes |
| 100-page report | 35,000 | Yes (27% of window) |
| 500-page document | 175,000 | No, exceeds window |
| Full novel (80K words) | 105,000 | Yes (82% of window, tight) |
For most users, 128K is more than enough. The main case where it's a constraint is very long conversations spanning hundreds of messages, or uploading very large documents.
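The table's percentages follow from a simple rule of thumb: roughly 1.3 tokens per English word. A small helper (the names are ours) reproduces the arithmetic:

```python
WINDOW = 128_000  # ChatGPT Plus advertised context window, in tokens

def words_to_tokens(words: int) -> int:
    """Rough heuristic: ~1.3 tokens per English word."""
    return int(words * 1.3)

def window_usage(words: int, window: int = WINDOW) -> float:
    """Fraction of the context window a text of `words` words occupies."""
    return words_to_tokens(words) / window

# An 80K-word novel lands near 80% of the 128K window -- tight,
# as the table notes. Real tokenizers will vary a few percent.
print(f"{window_usage(80_000):.0%}")
```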
1. Start a new chat with a summary. Ask ChatGPT to summarize the current conversation. Copy the summary. Start a new chat and paste the summary as context.
2. Upload key info as a file instead of pasting. ChatGPT can search uploaded files and pull in only the relevant passages, so a large file often consumes less of the window than pasting the full text.
3. Use a custom GPT with persistent instructions. Custom GPT instructions are included with every request, so they aren't dropped the way old conversation history is.
4. Switch to API access. The OpenAI API gives you the full 128K context and lets you manage history precisely. More work but more control.
5. Use a different model with a larger window. Claude (200K) or Gemini (1M+) handle longer documents better than ChatGPT.
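Workaround 4, managing history yourself, can look like the sketch below. The message shape mirrors the common chat-completions format, but the policy (pin the system prompt, drop the oldest turns) is our own illustrative choice, and the token count is the same rough heuristic as above:

```python
def estimate_tokens(text: str) -> int:
    return int(len(text.split()) * 1.3)  # rough heuristic, not a real tokenizer

def build_messages(system: str, turns: list[dict], budget: int) -> list[dict]:
    """Pin the system prompt; drop the oldest turns until the rest fits."""
    kept = list(turns)

    def cost(msgs: list[dict]) -> int:
        return estimate_tokens(system) + sum(
            estimate_tokens(m["content"]) for m in msgs
        )

    while kept and cost(kept) > budget:
        kept.pop(0)  # oldest turn goes first; the system prompt never does
    return [{"role": "system", "content": system}] + kept

turns = [
    {"role": "user", "content": "question one " * 30},
    {"role": "assistant", "content": "answer one " * 30},
    {"role": "user", "content": "question two " * 30},
]
# With a tiny 200-token budget, only the system prompt
# and the two most recent turns survive.
print(build_messages("You are a careful editor.", turns, budget=200))
```

Unlike the chat interface, this makes the trimming explicit: you decide what gets dropped, and you could swap in summarization instead of deletion.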
If you're pasting a long document into ChatGPT and want to know whether it will fit:
- Count the words (any word processor will show this)
- Multiply by roughly 1.3 to estimate tokens
- Compare the estimate against the 128K window, leaving headroom for the rest of the conversation and the reply
For very long documents, consider a model with a larger window (Claude or Gemini) instead of fighting with ChatGPT's limits.
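That check is easy to run locally before pasting. The ~1.3 factor is an English-prose approximation (code, tables, and other languages tokenize differently), and the 20% headroom default is our own choice:

```python
import pathlib

def will_fit(path: str, window: int = 128_000, headroom: float = 0.8) -> bool:
    """Estimate whether a text file fits, leaving room for the reply."""
    words = len(pathlib.Path(path).read_text(encoding="utf-8").split())
    return int(words * 1.3) <= window * headroom

# will_fit("report.txt") -> True if the estimate is under ~102K tokens
```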
The OpenAI API gives you the full advertised context window (128K for GPT-4o). ChatGPT's free tier sometimes uses a smaller effective window because of cost management. If you need maximum context, use the API directly — it's cheaper per token at typical use, and you control the full window.
For $20/month, ChatGPT Plus gives:
- The full 128K context window on GPT-4o and newer models
- Higher message limits than the free tier
- Access to additional models (GPT-4.1, o3 / o4-mini)
For users who run into token limits regularly, Plus is worth it. For casual use, the free tier is fine for most conversations.