
What Is Robots.txt? A Beginner's Guide to the File That Controls Search Crawlers

Last updated: April 2026 · 7 min read · SEO Tools

Robots.txt is like a "staff only" sign on a door. Search engine crawlers check it before entering your site. Polite bots follow the rules. Some do not. But without a sign at all, every bot walks in everywhere — including places you did not want them to go.

What Robots.txt Actually Does

When Google, Bing, or any search engine sends a bot to crawl your website, the bot's first stop is always the same: yoursite.com/robots.txt. It reads this file to find out:

  - Which parts of the site it may crawl
  - Which paths are off limits
  - Where the sitemap is, so it can find every public page

The file is plain text. No special formatting. No code. Just simple rules that bots understand.

Step 1: Check If You Already Have One

Open your browser and go to:

yoursite.com/robots.txt

If you see text with rules like "User-agent" and "Disallow," you already have a robots.txt file. If you see a 404 error, you do not have one — and that means crawlers are accessing everything on your site without any guidance.
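This browser check can also be scripted. Below is a minimal sketch using only Python's standard library; yoursite.com is a placeholder domain, and the network request only runs when you call the function:

```python
from urllib import request, error

def robots_txt_status(site: str) -> int:
    """Fetch site/robots.txt and return the HTTP status code."""
    url = site.rstrip("/") + "/robots.txt"
    try:
        with request.urlopen(url, timeout=10) as resp:
            return resp.status
    except error.HTTPError as exc:
        return exc.code  # e.g. 404 when no file exists

def has_robots_txt(status: int) -> bool:
    """A 200 means a file is being served; a 404 means there is none."""
    return status == 200

# Example usage (needs network access, so commented out):
# print(has_robots_txt(robots_txt_status("https://yoursite.com")))
```

The status code is all you need for this step: 200 means crawlers are getting guidance, 404 means they are not.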

Step 2: Understand the Basic Rules

Robots.txt uses four main instructions. That is it. Four:

| Instruction | What It Means | Example |
| --- | --- | --- |
| User-agent | Which bot these rules are for (* means all bots) | User-agent: * |
| Disallow | This path is off limits; do not crawl it | Disallow: /admin/ |
| Allow | This path is okay; overrides a Disallow rule | Allow: /admin/public-page/ |
| Sitemap | Here is where all my pages are listed | Sitemap: https://yoursite.com/sitemap.xml |

A complete robots.txt file is just combinations of these four instructions. Nothing more complicated than that.
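You can see these four instructions in action with Python's standard-library urllib.robotparser, which reads rules the way a polite crawler does. The paths below are made-up examples; note that Python's parser applies the first matching rule, so the Allow line is listed before the Disallow it overrides:

```python
from urllib import robotparser

# A tiny robots.txt combining the four instructions.
rules = [
    "User-agent: *",
    "Allow: /admin/public-page/",
    "Disallow: /admin/",
    "Sitemap: https://yoursite.com/sitemap.xml",
]

parser = robotparser.RobotFileParser()
parser.parse(rules)

print(parser.can_fetch("*", "/admin/secret"))        # blocked by Disallow
print(parser.can_fetch("*", "/admin/public-page/"))  # rescued by Allow
print(parser.can_fetch("*", "/blog/post"))           # no rule matches: crawlable
print(parser.site_maps())                            # sitemap URLs from the file
```

Anything not matched by a rule defaults to crawlable, which is why a missing robots.txt means bots go everywhere.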

Step 3: Create Your Robots.txt

The fastest way: use the Robots.txt Generator. Select the directories you want to block and the bots you want to target, and it writes the correct file for you. Download it and upload it to your website.

If you want to write it by hand, here is a simple starting point that works for most websites:

User-agent: *
Disallow: /admin/
Disallow: /private/
Disallow: /staging/

Sitemap: https://yoursite.com/sitemap.xml

This tells all bots: you can crawl everything except the admin, private, and staging directories. And here is the sitemap so you can find all the public pages efficiently.
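Before uploading a hand-written file, it is worth sanity-checking that it blocks what you intend. One way to do that, sketched here with Python's standard library (the file path and content mirror the starting point above):

```python
from pathlib import Path
from urllib import robotparser

# The starter file from above, written to disk (path is illustrative).
content = """User-agent: *
Disallow: /admin/
Disallow: /private/
Disallow: /staging/

Sitemap: https://yoursite.com/sitemap.xml
"""
Path("robots.txt").write_text(content, encoding="utf-8")

# Re-read it the way a polite crawler would and confirm the intent.
parser = robotparser.RobotFileParser()
parser.parse(Path("robots.txt").read_text(encoding="utf-8").splitlines())

for path in ("/admin/", "/private/files", "/staging/build"):
    assert not parser.can_fetch("*", path), f"{path} should be blocked"
assert parser.can_fetch("*", "/blog/"), "public pages should stay crawlable"
```

If any assertion fails, fix the file before uploading rather than after crawlers have already read the wrong rules.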

Step 4: Upload It

  1. Save your file as robots.txt (plain text, lowercase, no extension tricks)
  2. Upload it to the root directory of your website — the same folder where your homepage files live
  3. Verify by visiting yoursite.com/robots.txt in your browser
  4. You should see your rules displayed as plain text

If you use WordPress, you can edit robots.txt through your SEO plugin (Yoast or Rank Math) without touching FTP. See our WordPress robots.txt guide for step-by-step instructions.

Three Things Robots.txt Does NOT Do

These are the most common misconceptions — and getting them wrong can cause real problems:

| Misconception | Reality |
| --- | --- |
| Robots.txt hides pages from Google | Wrong. It tells Google not to CRAWL the page, but Google can still INDEX the URL if other websites link to it. The page can appear in search results with no description. To truly hide a page, use a noindex meta tag. |
| Robots.txt is security | Wrong. Anyone can read your robots.txt file by visiting yoursite.com/robots.txt. In fact, it often reveals which directories exist. Never rely on robots.txt to protect sensitive content. Use passwords and authentication. |
| Blocked pages are invisible | Wrong. Robots.txt blocks crawling, not linking. If another website links to a page you have blocked, search engines may still show that URL in results; they just cannot access the content to display a description. |
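The noindex meta tag mentioned in the first row goes in the page's head section. A minimal example:

```html
<!-- Placed inside <head>: tells crawlers to keep this page out of the index -->
<meta name="robots" content="noindex">
```

One catch: for the tag to work, the page must not be blocked in robots.txt, because the crawler has to fetch the page to see the tag at all.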

Real-World Analogy

Imagine your website is an office building:

  - Robots.txt is the sign at the front desk telling visitors which areas they may enter
  - Disallow rules are the "staff only" doors
  - Allow rules are the one public meeting room on an otherwise staff-only floor
  - The Sitemap line is the building directory in the lobby
  - And, as with any sign, some visitors ignore it entirely; it is guidance, not a lock

When You Definitely Need a Robots.txt

  - You have admin, login, or other private areas you do not want crawled
  - You run a staging or test copy of your site that should stay out of search results
  - You have a sitemap and want every crawler to find it

Tools to Get Started

Create a robots.txt file in seconds — no technical knowledge needed.

Open Robots.txt Generator