
Robots.txt Guide — How to Create & Optimize Your Robots.txt File

Last updated: March 2026 · 12 min read · SEO Tools

Your robots.txt file is one of the first things search engine crawlers look for when they visit your site. A single wrong rule can accidentally hide your entire website from Google. A well-configured robots.txt, on the other hand, keeps crawlers focused on your important pages and away from duplicate content, admin panels, and internal search results.

This guide covers everything from basic syntax to advanced directives. Whether you are setting up your first robots.txt or auditing an existing one, you will learn exactly what each rule does, how to avoid the most dangerous mistakes, and how to generate a clean robots.txt file in seconds with our free robots.txt generator.

What Is Robots.txt?

Robots.txt is a plain text file that lives at the root of your website — https://yoursite.com/robots.txt. It uses the Robots Exclusion Protocol to communicate with search engine crawlers (Googlebot, Bingbot, etc.) about which parts of your site they should and should not access.

Important clarification: robots.txt controls crawling, not indexing. Blocking a URL in robots.txt prevents crawlers from visiting the page, but Google can still index the URL if other sites link to it. The indexed result will just show the URL with no description. To fully prevent indexing, use a noindex meta tag on the page itself.

Robots.txt Syntax Explained

Every robots.txt file uses the same simple structure: one or more groups of rules, each starting with a User-agent line.

User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /admin/public-page/

User-agent: Googlebot
Disallow: /tmp/

Sitemap: https://yoursite.com/sitemap.xml
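If you want to sanity-check how a crawler reads rules like these, Python's standard urllib.robotparser module can parse the file and answer per-URL questions (yoursite.com and ExampleBot below are placeholders):

```python
from urllib.robotparser import RobotFileParser

# The example file from above, pasted as a string.
rules = """\
User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /admin/public-page/

User-agent: Googlebot
Disallow: /tmp/

Sitemap: https://yoursite.com/sitemap.xml
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Crawlers without their own group fall under User-agent: *
print(rp.can_fetch("ExampleBot", "https://yoursite.com/admin/"))  # False
print(rp.can_fetch("ExampleBot", "https://yoursite.com/blog/"))   # True

# Googlebot has its own group, so only its rules apply to it.
print(rp.can_fetch("Googlebot", "https://yoursite.com/tmp/file")) # False
print(rp.can_fetch("Googlebot", "https://yoursite.com/admin/"))   # True
```

One caveat: urllib.robotparser applies Allow/Disallow rules in file order (first match wins), while Google uses the most specific match, so results can differ when rules overlap.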

Key Directives

  User-agent — names the crawler the rules that follow apply to; * matches every crawler
  Disallow — a path prefix crawlers should not access; an empty value allows everything
  Allow — an exception that re-opens part of a disallowed path
  Crawl-delay — asks a crawler to pause between requests (covered below)
  Sitemap — points crawlers to your XML sitemap

Wildcards

Google and Bing support two wildcard characters:

  * — matches any sequence of characters; Disallow: /*?q= blocks every URL containing ?q=
  $ — matches the end of a URL; Disallow: /*.pdf$ blocks URLs ending in .pdf

Common Rules You Should Know

Block All Crawlers From Everything

User-agent: *
Disallow: /

This is the nuclear option. Only use it for staging sites, development environments, or sites that are genuinely private. This single rule blocks every crawler from every page, and your site will disappear from search results over time.
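You can verify the effect with urllib.robotparser from Python's standard library (ExampleBot and yoursite.com are placeholders):

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])

# Every path is now off-limits to every crawler.
print(rp.can_fetch("ExampleBot", "https://yoursite.com/"))          # False
print(rp.can_fetch("ExampleBot", "https://yoursite.com/any/page"))  # False
```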

Allow All Crawlers Everywhere

User-agent: *
Disallow:

An empty Disallow directive means "allow everything." This is effectively the same as not having a robots.txt file, but it explicitly signals to crawlers that you have considered access control.

Block Admin and Login Pages

User-agent: *
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login/
Disallow: /cart/
Disallow: /checkout/

These pages have no SEO value and waste crawl budget. Block them so crawlers spend their time on your content pages instead.

Block Internal Search Results

User-agent: *
Disallow: /search
Disallow: /?s=
Disallow: /*?q=

Internal search result pages create near-infinite thin content URLs. Google specifically warns against letting these get indexed.


Crawl-Delay Directive

Crawl-delay tells crawlers to wait a certain number of seconds between requests:

User-agent: *
Crawl-delay: 10

This asks crawlers to wait 10 seconds between each page request. This is useful for small servers that cannot handle aggressive crawling.

Important: Google ignores Crawl-delay entirely and manages Googlebot's crawl rate automatically; if crawling is overloading your server, the supported signals are 500, 503, or 429 responses. Bing, Yandex, and some other crawlers do respect this directive.
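Python's urllib.robotparser exposes the parsed value via its crawl_delay() helper, which is handy if you are writing your own polite crawler (ExampleBot is a placeholder name):

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse(["User-agent: *", "Crawl-delay: 10"])

# A well-behaved crawler would sleep this many seconds between requests.
print(rp.crawl_delay("ExampleBot"))  # 10
```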

Sitemap Directive

Adding a Sitemap directive to your robots.txt helps crawlers discover your XML sitemap, even if they have not found it through other means:

Sitemap: https://yoursite.com/sitemap.xml
Sitemap: https://yoursite.com/sitemap-posts.xml
Sitemap: https://yoursite.com/sitemap-pages.xml

You can list multiple sitemaps. Use the full absolute URL including the protocol. This directive can appear anywhere in the file — it does not need to be inside a User-agent group.
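urllib.robotparser (Python 3.8+) also surfaces Sitemap lines via site_maps(), which is a quick way to confirm the directives parse the way you expect (yoursite.com is a placeholder):

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow:

Sitemap: https://yoursite.com/sitemap.xml
Sitemap: https://yoursite.com/sitemap-posts.xml
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())
print(rp.site_maps())
# ['https://yoursite.com/sitemap.xml', 'https://yoursite.com/sitemap-posts.xml']
```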

Testing Your Robots.txt

Always test before deploying. A single typo can block Google from your entire site.

  1. Google Search Console — go to Settings > Crawling > robots.txt to test specific URLs against your rules
  2. Manual check — visit yoursite.com/robots.txt in a browser and read through every rule
  3. Bing Webmaster Tools — has its own robots.txt analyzer
  4. Our generator — the free robots.txt generator shows a live preview of your rules before you copy the file

Dangerous Mistakes to Avoid

  1. Accidentally blocking your entire site — a misplaced Disallow: / under User-agent: * is the most common catastrophic error
  2. Using robots.txt for security — the file is public. Never put sensitive paths in it thinking they will be hidden. Anyone can read your robots.txt.
  3. Blocking CSS and JavaScript — Google needs to render your pages. Blocking CSS/JS files prevents proper rendering and hurts mobile-first indexing.
  4. Putting robots.txt in the wrong location — it must be at the domain root, not in a subdirectory
  5. Conflicting Disallow and Allow rules — when rules conflict, Google uses the most specific match. But it is easy to create confusion. Keep rules simple and test thoroughly.
  6. Forgetting subdomains — www.yoursite.com and blog.yoursite.com are different hosts. Each needs its own robots.txt.
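To make the "most specific match" rule in point 5 concrete, here is a toy sketch of longest-match resolution. This is illustrative only: it uses plain prefix matching, whereas real crawlers also handle * and $ wildcards.

```python
def effective_rule(rules, path):
    """Return 'allow' or 'disallow' using longest-match resolution:
    the longest matching pattern wins, and Allow beats Disallow on a
    tie. Prefix matching only; no wildcard support."""
    best = None
    for kind, pattern in rules:
        if path.startswith(pattern):
            better = (
                best is None
                or len(pattern) > len(best[1])
                or (len(pattern) == len(best[1]) and kind == "allow")
            )
            if better:
                best = (kind, pattern)
    return best[0] if best else "allow"  # no matching rule means allowed

rules = [("disallow", "/admin/"), ("allow", "/admin/public-page/")]
print(effective_rule(rules, "/admin/secret"))        # disallow
print(effective_rule(rules, "/admin/public-page/"))  # allow
print(effective_rule(rules, "/blog/"))               # allow
```

The longer Allow pattern re-opens /admin/public-page/ even though /admin/ is disallowed, which is exactly why keeping overlapping rules to a minimum makes files easier to reason about.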


Generate Your Robots.txt

Our free robots.txt generator lets you build a robots.txt file without memorizing syntax. Toggle common rules on and off, add custom paths, include your sitemap URL, and copy the finished file. Everything runs in your browser with no data sent anywhere.

Once your robots.txt is configured, make sure the rest of your SEO fundamentals are in place. Use our meta tag generator to create proper title tags, descriptions, and Open Graph tags for every page on your site.
