
Robots.txt Generator — Complete Guide to Syntax, Rules & Examples

Last updated: April 2026 · 8 min read · SEO Tools

Robots.txt is the first file search engine crawlers look for when they visit your site. It controls what gets crawled and what does not. A misconfigured robots.txt can block your entire site from Google. A well-crafted one saves crawl budget and keeps private directories out of search results.

How Robots.txt Works

When any crawler — Googlebot, Bingbot, or others — arrives at your site, it first requests yoursite.com/robots.txt. If the file exists, the crawler reads the rules before crawling anything else. If the file does not exist, the crawler assumes everything is fair game and crawls freely.

The file lives at your domain root. Not in a subfolder. Not with a different name. Exactly at /robots.txt.
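The crawl check described above can be sketched with Python's standard-library parser. Note that `urllib.robotparser` does simple prefix matching and does not implement the `*` and `$` wildcards, so this sketch uses a plain directory rule (the URLs are placeholders):

```python
# Sketch: how a crawler evaluates robots.txt rules before fetching a page,
# using Python's standard-library parser (prefix matching only).
from urllib import robotparser

rules = "\n".join([
    "User-agent: *",
    "Disallow: /private/",
])

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# A path under /private/ is blocked; everything else is fair game.
print(rp.can_fetch("Googlebot", "https://example.com/private/report.html"))  # False
print(rp.can_fetch("Googlebot", "https://example.com/index.html"))           # True
```

In production a crawler would call `rp.set_url("https://example.com/robots.txt")` and `rp.read()` instead of parsing an inline string.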

Complete Syntax Reference

| Directive | Purpose | Example |
| --- | --- | --- |
| `User-agent` | Specifies which crawler the rules apply to | `User-agent: Googlebot` |
| `Disallow` | Blocks a path from being crawled | `Disallow: /admin/` |
| `Allow` | Overrides a Disallow for a specific path | `Allow: /admin/public/` |
| `Sitemap` | Points crawlers to your XML sitemap | `Sitemap: https://example.com/sitemap.xml` |
| `Crawl-delay` | Seconds between requests (ignored by Google) | `Crawl-delay: 10` |
| `*` (wildcard) | Matches any character sequence in a URL | `Disallow: /*.pdf$` |
| `$` (end match) | Matches the end of a URL | `Disallow: /*.json$` |
| `#` (comment) | Adds a human-readable note | `# Block staging pages` |
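Combined, these directives make up a complete file. A minimal illustrative example (the paths and sitemap URL are placeholders, not recommendations):

```txt
# Block the admin area for all crawlers, except its public section
User-agent: *
Disallow: /admin/
Allow: /admin/public/
Disallow: /*.pdf$

# Ignored by Google; some other bots honor it
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml
```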

10 Common Robots.txt Patterns

| Pattern | Rules | What It Does |
| --- | --- | --- |
| Allow everything | `User-agent: *`<br>`Allow: /` | All crawlers can access all pages (the most open configuration) |
| Block everything | `User-agent: *`<br>`Disallow: /` | No crawler can access any page; useful for staging sites |
| Block one directory | `User-agent: *`<br>`Disallow: /private/` | Blocks the /private/ directory from all crawlers |
| Block multiple directories | `User-agent: *`<br>`Disallow: /admin/`<br>`Disallow: /tmp/`<br>`Disallow: /staging/` | Blocks three directories from all crawlers |
| Allow only Googlebot | `User-agent: Googlebot`<br>`Allow: /`<br>`User-agent: *`<br>`Disallow: /` | Only Google can crawl; everyone else is blocked |
| Block images directory | `User-agent: *`<br>`Disallow: /images/` | Prevents crawlers from indexing your image directory |
| Block PDF files | `User-agent: *`<br>`Disallow: /*.pdf$` | Blocks all files ending in .pdf from being crawled |
| Block query parameters | `User-agent: *`<br>`Disallow: /*?*` | Blocks URLs with query strings (filters, sorts, session IDs) |
| Add sitemap reference | `Sitemap: https://example.com/sitemap.xml` | Tells all crawlers where your sitemap lives; place at the bottom |
| Slow down crawling | `User-agent: *`<br>`Crawl-delay: 5` | Asks bots to wait 5 seconds between requests (Google ignores this) |
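Patterns with multiple `User-agent` groups are the easiest to get wrong. A quick sanity check of the "Allow only Googlebot" pattern, sketched with Python's standard-library parser (the page URL is a placeholder):

```python
# Verify the "Allow only Googlebot" pattern: Google is allowed in,
# every other user agent falls through to the blanket Disallow.
from urllib import robotparser

rules = """User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("Googlebot", "https://example.com/page"))  # True
print(rp.can_fetch("Bingbot", "https://example.com/page"))    # False
```

The blank line between groups matters: it separates the Googlebot rules from the catch-all group.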

Robots.txt Generator Comparison

| Feature | WildandFree Generator | Google Robots Tester | SmallSEOTools | SEOptimer |
| --- | --- | --- | --- | --- |
| Generate robots.txt | ✓ Full generator with all directives | ✗ Testing only, no generation | ✓ Basic generator | ✓ Basic generator |
| Test existing file | ✓ Preview output | ✓ Excellent (official tool) | ✗ No testing | ~ Basic validation |
| AI bot rules | ✓ Includes GPTBot, CCBot, etc. | ✗ Not covered | ✗ Not covered | ✗ Not covered |
| Wildcard support | ✓ Full pattern matching | ✓ Shows wildcard results | ~ Limited | ~ Limited |
| Custom directives | ✓ Crawl-delay, multiple user-agents | ✗ Read-only testing | ~ Some options | ~ Some options |
| Export/download | ✓ One-click download | ✗ Not applicable | ✓ Copy text | ✓ Copy text |
| No account required | ✓ Free, no signup | ✗ Requires Search Console access | ✓ Free | ~ Free with limits |
| Privacy | ✓ Runs in your browser | ✓ Google servers | ~ Ad-supported, data collected | ~ Ad-supported, data collected |

Critical Mistakes to Avoid

- Leaving `Disallow: /` live on production. That single line blocks your entire site from Google; reserve it for staging.
- Putting the file in a subfolder or giving it a different name. Crawlers only check `/robots.txt` at the domain root; anywhere else, they assume everything is fair game.
- Relying on `Crawl-delay` to throttle Googlebot. Google ignores the directive, so use Search Console's crawl-rate settings instead.
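The first of these mistakes is cheap to catch automatically. A minimal sketch of a deploy-time check, using Python's standard-library parser (`blocks_everything` is a hypothetical helper name):

```python
# Guard against shipping a staging robots.txt that blocks the whole site.
from urllib import robotparser

def blocks_everything(robots_txt: str) -> bool:
    """Return True if the rules block all crawlers from the site root."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return not rp.can_fetch("*", "https://example.com/")

print(blocks_everything("User-agent: *\nDisallow: /"))      # True  (staging config)
print(blocks_everything("User-agent: *\nDisallow: /tmp/"))  # False (safe)
```

Wiring a check like this into CI means a staging `Disallow: /` never reaches production unnoticed.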

Generate a valid robots.txt file in seconds — no syntax memorization required.

Open Robots.txt Generator