Robots.txt is like a "staff only" sign on a door. Search engine crawlers check it before entering your site. Polite bots follow the rules. Some do not. But without a sign at all, every bot walks in everywhere — including places you did not want them to go.
When Google, Bing, or any search engine sends a bot to crawl your website, the bot's first stop is always the same: yoursite.com/robots.txt. It reads this file to find out which parts of your site it is allowed to crawl and which parts it should skip.
The file is plain text. No special formatting. No code. Just simple rules that bots understand.
Open your browser and go to:
yoursite.com/robots.txt
If you see text with rules like "User-agent" and "Disallow," you already have a robots.txt file. If you see a 404 error, you do not have one — and that means crawlers are accessing everything on your site without any guidance.
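The "first stop" is always the same place: the root of the domain, no matter which page the crawler was sent to. As an illustration (using the article's yoursite.com placeholder), here is a small Python sketch that derives the robots.txt URL for any page on a site:

```python
from urllib.parse import urlparse, urlunparse

def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site serving page_url.

    Crawlers always look for the file at the root of the host,
    regardless of which page they were asked to fetch.
    """
    parts = urlparse(page_url)
    return urlunparse((parts.scheme, parts.netloc, "/robots.txt", "", "", ""))

print(robots_url("https://yoursite.com/blog/some-post"))
# https://yoursite.com/robots.txt
```

Paste the result into your browser: a page of rules means the file exists, a 404 means it does not.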
Robots.txt uses four main instructions. That is it. Four:
| Instruction | What It Means | Example |
|---|---|---|
| User-agent | Which bot these rules are for (* means all bots) | User-agent: * |
| Disallow | This path is off limits — do not crawl it | Disallow: /admin/ |
| Allow | This path is okay — override a Disallow rule | Allow: /admin/public-page/ |
| Sitemap | Here is where all my pages are listed | Sitemap: https://yoursite.com/sitemap.xml |
A complete robots.txt file is just combinations of these four instructions. Nothing more complicated than that.
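You can see the four instructions in action with Python's standard-library `urllib.robotparser`, which answers "may this bot fetch this URL?" for a given set of rules. A quick sketch (one caveat: the stdlib parser applies the first matching rule in file order, so the Allow line is listed before the Disallow it overrides; Google's own crawler instead prefers the most specific match):

```python
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: *",
    "Allow: /admin/public-page/",  # listed first: the stdlib parser
    "Disallow: /admin/",           # applies the first rule that matches
    "Disallow: /private/",
]

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "https://yoursite.com/blog/"))               # True
print(rp.can_fetch("*", "https://yoursite.com/admin/"))              # False
print(rp.can_fetch("*", "https://yoursite.com/admin/public-page/"))  # True
```

No rule matches /blog/, so crawling defaults to allowed; /admin/ hits the Disallow; the Allow carves out the one public page.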
The fastest way: use the Robots.txt Generator. Select which directories you want to block, which bots you want to target, and it writes the correct file for you. Download it and upload it to your website.
If you want to write it by hand, here is a simple starting point that works for most websites:
```
User-agent: *
Disallow: /admin/
Disallow: /private/
Disallow: /staging/

Sitemap: https://yoursite.com/sitemap.xml
```
This tells all bots: you can crawl everything except the admin, private, and staging directories. And here is the sitemap so you can find all the public pages efficiently.
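If you would rather script the file than type it, the starting point above is easy to assemble programmatically. A minimal sketch, using the same placeholder paths and sitemap URL (the `build_robots_txt` helper is illustrative, not part of any tool mentioned here):

```python
def build_robots_txt(blocked_paths, sitemap_url):
    """Assemble a robots.txt that blocks the given paths for all bots."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {path}" for path in blocked_paths]
    lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"

content = build_robots_txt(
    ["/admin/", "/private/", "/staging/"],
    "https://yoursite.com/sitemap.xml",
)
print(content)
```

Write the returned string to a file named robots.txt and upload it; nothing else is required.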
To install the file:

- Name it robots.txt (plain text, lowercase, no extension tricks).
- Upload it to the root of your domain.
- Verify it by visiting yoursite.com/robots.txt in your browser.

If you use WordPress, you can edit robots.txt through your SEO plugin (Yoast or Rank Math) without touching FTP. See our WordPress robots.txt guide for step-by-step instructions.
These are the most common misconceptions — and getting them wrong can cause real problems:
| Misconception | Reality |
|---|---|
| Robots.txt hides pages from Google | Wrong. It tells Google not to CRAWL the page, but Google can still INDEX the URL if other websites link to it. The page can appear in search results with no description. To truly hide a page, use a noindex meta tag. |
| Robots.txt is security | Wrong. Anyone can read your robots.txt file by visiting yoursite.com/robots.txt. In fact, it often reveals which directories exist. Never rely on robots.txt to protect sensitive content. Use passwords and authentication. |
| Blocked pages are invisible | Wrong. Robots.txt blocks crawling, not linking. If another website links to a page you have blocked, search engines may still show that URL in results — they just cannot access the content to display a description. |
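As the first row notes, truly keeping a page out of search results takes a noindex directive, not a robots.txt block. The standard form is a meta tag in the page's head:

```html
<!-- In the <head> of the page you want kept out of search results -->
<meta name="robots" content="noindex">
```

One caveat: the crawler has to fetch the page to see this tag, so do not also block that page in robots.txt, or the noindex will never be read.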
Imagine your website is an office building: robots.txt is the sign at the front desk telling visitors which rooms are open and which are staff only. Polite visitors follow the signs, but signs are not locks — anyone determined can still open the doors. Actual locks (passwords and authentication) are a separate job.
Create a robots.txt file in seconds — no technical knowledge needed.
Open Robots.txt Generator