JSON vs XML for LLM Prompts — Which Format Wins in 2026?
- Claude (Anthropic) was trained with XML-style tags and responds best to them, with a 5-15 percentage-point accuracy lift on structured tasks.
- GPT (OpenAI) prefers JSON for structured output; function calling returns JSON natively.
- Gemini accepts both; format choice matters less than clear structure and clean examples.
The best prompt format depends on the model. Claude was trained with XML-like tags and extracts them reliably. GPT-4 and GPT-5 were trained with JSON-heavy examples and return JSON natively through function calling. Gemini works with either. This guide covers which format to use per model, token cost differences, and when mixing the two is the right call. Plus a free converter to flip between them.
Why Format Matters at All
LLMs are trained on specific patterns. If a model saw thousands of prompts with <example> and <rules> tags during training, it will parse those tags more reliably than arbitrary Markdown or JSON structures. Anthropic published a guide recommending XML tags specifically because Claude was trained on that pattern.
Accuracy lift from format choice can move a task 5-15 percentage points on benchmarks. For production systems, that's the difference between "works" and "works well enough to ship."
Claude — Use XML Tags
Anthropic recommends XML tags for Claude prompts. Specifically:
- Wrap examples in <example> tags.
- Separate instructions from inputs with <instructions> and <document> tags.
- Use <thinking> tags to elicit reasoning.
- Use <answer> tags to constrain output.
Example:
<instructions>Classify the sentiment of each review.</instructions>
<reviews>
<review id="1">The service was great.</review>
<review id="2">Never again.</review>
</reviews>
<output_format>JSON array of {id, sentiment}</output_format>
The tags don't need to be schema-valid XML. They just need to be consistent. Claude picks up the structure reliably.
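The pattern above is easy to generate programmatically. Here is a minimal sketch, using a hypothetical xml_section helper (not part of any SDK), that assembles the same sentiment-classification prompt from plain strings:

```python
def xml_section(tag: str, body: str) -> str:
    # Wrap a prompt section in a consistent XML-style tag pair.
    # The tags need not be schema-valid XML; consistency is what matters.
    return f"<{tag}>\n{body}\n</{tag}>"

reviews = "\n".join(
    f'<review id="{i}">{text}</review>'
    for i, text in enumerate(["The service was great.", "Never again."], start=1)
)

prompt = "\n".join([
    xml_section("instructions", "Classify the sentiment of each review."),
    xml_section("reviews", reviews),
    xml_section("output_format", "JSON array of {id, sentiment}"),
])
```

Because every section goes through the same helper, the tag names stay consistent across all your prompts, which is the property Claude keys on.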
GPT — JSON Native, Function Calling Wins
GPT-4 and GPT-5 handle both formats, but function calling (structured output mode) returns JSON natively and is the cleanest path for any structured task. When you need machine-parseable output from GPT, define a JSON schema and call the model with response_format: {type: "json_object"} or via tool use.
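As a sketch, the request body for JSON mode looks like the following. Field names follow the OpenAI Chat Completions API; the model name is a placeholder, and note that the API expects the word "JSON" to appear somewhere in the messages when json_object mode is set:

```python
import json

def json_mode_request(system: str, user: str, model: str = "gpt-4o") -> dict:
    # Build a Chat Completions request body with JSON mode enabled.
    # response_format {"type": "json_object"} constrains output to valid JSON.
    return {
        "model": model,
        "response_format": {"type": "json_object"},
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    }

body = json_mode_request(
    "You extract sentiment. Respond in JSON.",
    "Review: The service was great.",
)
print(json.dumps(body, indent=2))
```

For stricter guarantees, tool use with an explicit JSON schema validates the shape of the output, not just its syntax.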
For prompt context, GPT works well with:
- Markdown headers for sections (## Instructions, ## Examples).
- JSON for structured input data.
- Numbered lists for multi-step instructions.
XML tags work too, but are rarer in GPT's training distribution. If you're switching a prompt from Claude to GPT, you can usually leave the XML tags in place — GPT just treats them as prose boundaries.
Gemini — Format-Agnostic, Content First
Gemini handles XML and JSON with similar accuracy on most tasks. Format choice matters less than:
- Clean, unambiguous instructions.
- Concrete examples.
- Explicit output format description.
If your team uses both Claude and Gemini, pick XML tags — it works equally well on Gemini and strictly better on Claude. If you use GPT and Gemini, pick JSON for the GPT compatibility win.
Token Costs — JSON Is Usually Cheaper
XML's closing tags (</example>) are longer than JSON's braces and brackets. Across a long prompt with many examples, the JSON version typically uses 5-15% fewer tokens than equivalent XML.
For very long, example-heavy prompts at scale, that token delta compounds into real API cost. Offsetting this: Claude with XML may need fewer re-rolls due to better structure recognition, so the cost-per-successful-call can come out even.
Rule of thumb: for single prompts, pick the format the model prefers. For high-volume automation, measure both on real traffic.
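The size gap is easy to sanity-check offline. A rough sketch using character counts as a crude proxy for tokens (real tokenizers vary, so measure with your model's actual tokenizer before relying on the numbers):

```python
import json

examples = [{"id": i, "text": f"sample review number {i}"} for i in range(50)]

# Same data, two encodings: XML with explicit closing tags vs. compact JSON.
xml_form = "<examples>\n" + "\n".join(
    f'<example id="{e["id"]}">{e["text"]}</example>' for e in examples
) + "\n</examples>"
json_form = json.dumps(examples, separators=(",", ":"))

print(len(xml_form), len(json_form))  # the JSON form comes out shorter here
```

The per-item overhead is what compounds: each XML example pays for an opening tag with an attribute plus a ten-character closing tag, while JSON pays only for braces, quotes, and key names.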
Mixing the Two — A Hybrid Pattern That Works
A common production pattern: XML tags for prompt structure, JSON for structured input and output.
<instructions>Extract entities and relationships.</instructions>
<input>
{"text": "Acme acquired Zenith in 2024 for $50M."}
</input>
<output_format>
{"entities": [...], "relationships": [...]}
</output_format>
Best of both worlds — the XML tags guide the model to each section, and the JSON blocks give the structured shape the downstream code expects. Works on Claude, GPT, and Gemini. If you need to convert between the two formats in your prompt engineering workflow, use our JSON to XML converter.
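The hybrid pattern above can be captured in a small builder. A sketch with a hypothetical hybrid_prompt helper (not from any SDK): XML tags delimit the sections, json.dumps fills in the structured payloads.

```python
import json

def hybrid_prompt(instructions: str, payload: dict, output_shape: dict) -> str:
    # XML tags mark section boundaries; JSON carries the structured data
    # inside them, so downstream code can parse the model's reply directly.
    return "\n".join([
        f"<instructions>{instructions}</instructions>",
        "<input>",
        json.dumps(payload),
        "</input>",
        "<output_format>",
        json.dumps(output_shape),
        "</output_format>",
    ])

prompt = hybrid_prompt(
    "Extract entities and relationships.",
    {"text": "Acme acquired Zenith in 2024 for $50M."},
    {"entities": [], "relationships": []},
)
```

Because the input is serialized with json.dumps rather than hand-written, quoting and escaping are handled for you, which matters once user text starts containing quotes or angle brackets.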
Switching Prompt Formats? Convert in Seconds
Paste your JSON-structured prompt, get XML-tagged output ready for Claude. No signup.
Frequently Asked Questions
Does Claude really perform better with XML tags than JSON?
On structured extraction and multi-step reasoning tasks, yes — measured improvements of 5-15 percentage points in Anthropic's own benchmarks. For simple Q&A, the difference is usually within noise. Use XML tags for anything involving multiple distinct prompt sections.
Should I use XML tags in GPT prompts?
They work fine — GPT treats them as prose boundaries. But Markdown headers or JSON structure are more idiomatic for GPT. If you maintain one prompt for multiple models, XML tags are a reasonable common denominator.
What about system prompts — any different?
System prompts benefit from structure even more than user prompts because they set ground rules. Claude's Anthropic-recommended system prompt pattern uses heavy XML tagging. GPT system prompts are usually shorter and less structured.
How do I convert a Claude XML prompt to GPT JSON structure?
Use our JSON to XML converter in reverse, or just rewrite the top-level sections.