How to Write an AI System Prompt That Actually Works
System prompts are the foundation of every AI product worth shipping. The system prompt is the first message your model sees in every conversation, the place where you define who the assistant is, what it can do, what it must never do, and how it should respond. Get the system prompt right and the model behaves; get it wrong and you spend the next month patching symptoms.
This guide walks through every section of a production-grade system prompt with examples you can copy. If you want to skip the manual assembly, the free system prompt generator builds the same structure from a use case picker and a few toggles.
What a System Prompt Actually Does
A system prompt sits at the top of an API request and tells the model two things: who you are and what the rules are. It is read before any user message and is treated with higher priority by every major model — OpenAI calls it the "system" role, Anthropic calls it the "system" parameter, but the concept is identical across providers.
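To make the placement concrete, here is a sketch of the two request shapes as plain dictionaries (no network call). The prompt text and model names are illustrative; the structural point is that OpenAI puts the system prompt in the messages array while Anthropic passes it as a top-level parameter.

```python
SYSTEM_PROMPT = "You are Ada, a customer support assistant for Acme Corp."

# OpenAI-style: the system prompt is the first entry in the messages array.
openai_request = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "How do I reset my password?"},
    ],
}

# Anthropic-style: the system prompt is a top-level parameter,
# separate from the messages array.
anthropic_request = {
    "model": "claude-sonnet-4",  # placeholder model name
    "system": SYSTEM_PROMPT,
    "messages": [
        {"role": "user", "content": "How do I reset my password?"},
    ],
    "max_tokens": 1024,
}
```

Either way, the prompt is resent with every request, which is why its length matters for cost.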
The model uses your system prompt to filter every subsequent response. If you say "you are a paralegal assistant," it will deflect requests to write Python code, even when a user asks directly. If you say "always cite sources," it will avoid making claims it cannot back up. If you say "be concise," it will trim padding from every answer. The system prompt is your behavioral knob, and it is the cheapest one you have.
The cost of a system prompt is its token count, which counts toward your context window and your bill on every request. Most production prompts are 100 to 500 tokens. Use the token counter to measure yours before shipping.
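For an exact count you need the provider's tokenizer (OpenAI's tiktoken, for example), but a common rough heuristic for English text is about four characters per token. A minimal sketch of that estimate:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per English token.
    Use a real tokenizer (e.g. tiktoken) for exact counts."""
    return max(1, round(len(text) / 4))

prompt = (
    "You are Ada, a customer support assistant for Acme Corp. "
    "You answer questions about products, pricing, refunds, and account settings. "
    "Always admit when you don't know something. Never invent product features."
)
print(estimate_tokens(prompt))  # ~50 tokens for this ~200-character prompt
```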
The Five Sections of Every Good System Prompt
Effective system prompts follow a consistent structure: identity, capabilities, rules, constraints, and output format. Skip any of these and the model has to guess.
- Identity — Who the assistant is in one sentence. "You are Ada, a customer support assistant for Acme Corp."
- Capabilities — What it can do. "You answer questions about products, pricing, refunds, and account settings."
- Rules — Behavioral guardrails. "Always admit when you don't know something. Never invent product features."
- Constraints — Hard nos. "Never share competitor pricing. Never promise refunds outside of the documented policy."
- Output format — How to respond. "Reply in plain English, under 100 words unless the user asks for detail. Use bullet points for lists."
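Because the structure is fixed, assembling the prompt can be mechanical. A sketch using the example sentences from the list above (section names and wording are just for illustration):

```python
# The five sections, in the order the model should read them.
SECTIONS = {
    "identity": "You are Ada, a customer support assistant for Acme Corp.",
    "capabilities": "You answer questions about products, pricing, refunds, and account settings.",
    "rules": "Always admit when you don't know something. Never invent product features.",
    "constraints": "Never share competitor pricing. Never promise refunds outside of the documented policy.",
    "output_format": "Reply in plain English, under 100 words unless the user asks for detail. Use bullet points for lists.",
}

# Join the sections with blank lines into the final system prompt.
system_prompt = "\n\n".join(SECTIONS.values())
print(system_prompt)
```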
The free system prompt generator lays these sections out for you with togglable rules, so you do not have to remember the structure every time you start a new project.
Identity: Who Is the AI?
The identity sentence anchors the entire conversation. The model uses it as a north star — when a user goes off topic, the model uses the identity to decide whether to engage or redirect. A vague identity ("you are a helpful assistant") leads to vague behavior. A specific identity ("you are Max, a senior DevOps engineer who specializes in Kubernetes troubleshooting") produces sharp, on-topic answers.
If your product is a customer-facing chatbot, give the assistant a name. Users prefer talking to "Max" over "the AI assistant" — it lowers the abstraction barrier and makes the bot feel less like a black box. If your product is internal or API-only, the name matters less, but the role still matters a lot.
One useful trick: include the company or product name in the identity. "You are a coding assistant for Stripe API integrations" focuses the model on the right domain even when the user's question is ambiguous.
Capabilities: What It Can Do (and What It Cannot)
The capabilities section tells the model the boundaries of its job. List what it CAN do explicitly. Listing what it cannot do is also useful, but a strong capability list usually makes the cannot list shorter.
Example for a customer support bot: "You answer questions about Acme product features, pricing tiers, billing, account settings, and how to contact human support. You can troubleshoot common issues using our knowledge base. You cannot process refunds — you must direct refund requests to human support."
Notice how specific that is. "Process refunds" is not in the capability list, so the model will refuse if asked. You did not have to say "do not process refunds" — the absence is the constraint. This is faster and uses fewer tokens than a wall of negative rules.
Rules and Behavior Toggles
Rules are the third layer. These are the toggles you can flip on and off depending on your application. Common rules:
- Stay on topic — refuse off-topic questions politely
- Admit unknowns — say "I don't know" instead of inventing answers
- Cite sources — reference where claims come from
- Be concise — short answers by default, long only if asked
- Ask clarifying questions — when a request is ambiguous, ask first
- Use structured output — bullet points, headings, code blocks where appropriate
- No personal opinions — facts only, no subjective judgments
- Friendly or formal tone — pick one consciously
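The toggle idea is easy to sketch in code: keep a library of rule sentences and join the enabled ones into the rules section. The rule names and wording here are hypothetical, not from any particular generator.

```python
# Hypothetical rule library: toggle name -> rule sentence.
RULE_LIBRARY = {
    "stay_on_topic": "Politely refuse questions outside your domain.",
    "admit_unknowns": "If you do not know the answer, say so instead of guessing.",
    "cite_sources": "Reference where your claims come from.",
    "be_concise": "Keep answers short by default; expand only when asked.",
    "clarify_first": "If a request is ambiguous, ask a clarifying question before answering.",
    "no_opinions": "State facts only; avoid subjective judgments.",
}

def build_rules(enabled: list) -> str:
    """Assemble the rules section from the enabled toggles."""
    return "\n".join(f"- {RULE_LIBRARY[name]}" for name in enabled)

rules_section = build_rules(["stay_on_topic", "admit_unknowns", "be_concise"])
print(rules_section)
```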
The free system prompt generator ships with all of these as one-click toggles, so you can mix and match for any project.
Constraints: The Hard Nos
Constraints are rules with teeth. They are not preferences — they are hard limits the model must never cross. Use them sparingly because every constraint adds tokens and risk of confusion. Save them for things that would actually hurt your business or users if violated.
Examples of legitimate hard constraints:
- "Never share competitor pricing" — sales bots
- "Never give medical advice — always recommend consulting a doctor" — health apps
- "Never claim to be human" — any conversational AI (legally required in some jurisdictions)
- "Never store or repeat personal information from prior messages" — privacy-sensitive apps
If you find yourself writing more than five constraints, you may be using the system prompt to patch a deeper product problem. Step back and ask whether the model needs different training or whether the user flow needs to change.
Output Format and Length
The fifth section tells the model what shape responses should take. This is where you say "respond in markdown," "respond in JSON only," "use bullet points for lists," "keep responses under 100 words," "respond in the same language as the user."
If your application parses model output programmatically (e.g., extracting JSON to display in a UI), be brutally explicit about format. Show an example. The model is far more reliable when you give it a template to fill in.
Example: "Respond ONLY with valid JSON in this format: { \"answer\": string, \"confidence\": number, \"sources\": string[] }. Do not include any text outside the JSON object."
How Long Should a System Prompt Be?
Most effective system prompts are between 100 and 500 tokens. That is roughly 75 to 375 words. Shorter than 100 tokens and you probably have not given the model enough to work with. Longer than 500 and you start to see diminishing returns — the model begins to forget or deprioritize earlier rules.
A 1,000-token prompt is acceptable for complex applications with many constraints, but you are paying for those tokens on every API call. If you serve 10,000 conversations a day with a 1,000-token prompt at GPT-4o pricing, that is $25 to $50 a day in system prompt tokens alone before any user input. The AI cost calculator can show you the math for your specific model and volume.
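The arithmetic behind that estimate is simple; here it is spelled out, assuming an input price of $2.50 per million tokens (prices change, so check your provider's current rates):

```python
conversations_per_day = 10_000
prompt_tokens = 1_000
price_per_million_input = 2.50  # assumed USD input price; verify current pricing

daily_tokens = conversations_per_day * prompt_tokens   # 10M prompt tokens/day
daily_cost = daily_tokens / 1_000_000 * price_per_million_input
print(f"${daily_cost:.2f} per day")  # $25.00 at this price point
```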
Frequently Asked Questions
What is a system prompt?
A system prompt is the first message in an AI conversation that defines the assistant's identity, capabilities, rules, and output format. It is sent with every API request and shapes how the model responds to all subsequent user messages.
How long should a system prompt be?
Most production system prompts are 100-500 tokens (75-375 words). Shorter prompts under-specify behavior; longer prompts (over 1000 tokens) cause the model to deprioritize earlier rules and burn budget on every request.
Do all AI models support system prompts?
Yes. OpenAI (GPT-4, GPT-4o), Anthropic (Claude), Google (Gemini), Mistral, and Meta (Llama) all support system prompts via their API. The parameter name varies — OpenAI uses the system role in the messages array, Anthropic uses a top-level system parameter.
Should I use a system prompt or user message instructions?
Use system prompts for persistent behavior that should apply to every message (role, rules, output format). Use user messages for task-specific instructions that change request to request. System prompts also get cached by some providers, reducing cost.
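The split looks like this in practice: the system prompt is a constant that every request shares (which is also what provider-side prompt caching can reuse), while the user turn varies. A sketch, with illustrative prompt text:

```python
SYSTEM_PROMPT = "You are Ada, a customer support assistant for Acme Corp."

def make_request(user_message: str) -> dict:
    """Persistent behavior lives in the system prompt; task-specific
    instructions go in the user message, which changes per request."""
    return {
        "model": "gpt-4o",
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
    }

r1 = make_request("How do I change my billing email?")
r2 = make_request("Summarize your refund policy.")
```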

