WordPress generates a basic robots.txt automatically, but it is too minimal for most sites. Here is the recommended configuration, how to edit it with any setup, and the specific rules WooCommerce stores need.
This configuration works for the majority of WordPress sites — blogs, business sites, portfolios, and small e-commerce:
```
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-includes/
Disallow: /cgi-bin/
Disallow: /trackback/
Disallow: /xmlrpc.php
Disallow: /?s=
Disallow: /search/
Disallow: /*?replytocom=
Disallow: /tag/*/page/
Disallow: /category/*/page/

Sitemap: https://yoursite.com/sitemap_index.xml
```
| Rule | Purpose | Why It Matters |
|---|---|---|
| Disallow: /wp-admin/ | Blocks the WordPress admin dashboard | Crawlers should not index login screens and admin pages |
| Allow: /wp-admin/admin-ajax.php | Allows AJAX requests used by themes and plugins | Many front-end features break if this is blocked — forms, search, dynamic content |
| Disallow: /wp-includes/ | Blocks WordPress core files | Core PHP files are not useful for search engines to index |
| Disallow: /cgi-bin/ | Blocks server scripts directory | Standard security practice — no user-facing content here |
| Disallow: /trackback/ | Blocks trackback URLs | Trackbacks are outdated and a spam vector — no reason to crawl them |
| Disallow: /xmlrpc.php | Blocks XML-RPC endpoint | Used for pingbacks and remote access — also a common attack target |
| Disallow: /?s= and /search/ | Blocks internal search result pages | Internal search results are thin content — let Google index your real pages instead |
| Disallow: /*?replytocom= | Blocks comment reply URLs | Prevents duplicate content from threaded comment links |
| Disallow: /tag/*/page/ | Blocks paginated tag archives | Saves crawl budget — page 2, 3, 4 of tag archives add little value |
| Disallow: /category/*/page/ | Blocks paginated category archives | Same as tags — paginated archives waste crawl budget |
| Sitemap: ... | Points crawlers to your sitemap | Essential — tells crawlers exactly where to find all your content |
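You can sanity-check rules like these before deploying them, using Python's standard-library `urllib.robotparser`. One caveat: Python's parser applies rules first-match-wins and does not understand the `*` wildcard extension, so the `Allow` line is placed before the broader `Disallow` in this sketch. Googlebot instead uses most-specific-match (longest path wins, `Allow` wins ties), so the order in the file above is fine for Google.

```python
from urllib.robotparser import RobotFileParser

# A wildcard-free subset of the recommended rules. Allow comes first
# because Python's parser stops at the first matching rule.
rules = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
Disallow: /?s=
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# admin-ajax.php stays crawlable even though /wp-admin/ is blocked
print(parser.can_fetch("*", "https://yoursite.com/wp-admin/admin-ajax.php"))  # True
print(parser.can_fetch("*", "https://yoursite.com/wp-admin/options.php"))     # False
print(parser.can_fetch("*", "https://yoursite.com/?s=test"))                  # False
print(parser.can_fetch("*", "https://yoursite.com/blog/hello-world/"))        # True
```

`yoursite.com` is a placeholder; swap in your own domain when testing.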
Note: if the File editor tab is missing, your hosting provider has likely disabled file editing (some managed hosts do this for security). Edit robots.txt over FTP or SFTP instead.
robots.txt is a plain text file — no special encoding is needed.

If you run WooCommerce, add these rules to block pages that contain session-specific or private data:
```
# WooCommerce specific
Disallow: /cart/
Disallow: /checkout/
Disallow: /my-account/
Disallow: /wishlist/
Disallow: /*?add-to-cart=*
Disallow: /*?orderby=*
Disallow: /*?filter_*
```
These rules block cart pages, checkout flows, account dashboards, and product filter URLs from being crawled. Product pages and category pages remain fully accessible to search engines.
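The wildcard rules rely on Google-style matching: `*` matches any run of characters, `$` anchors the end of the URL, and patterns match from the start of the path. A minimal sketch of those semantics (illustrative only, not a full robots.txt parser — the `rule_matches` helper is something written here for demonstration, not a library function):

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Google-style robots.txt matching: '*' is a wildcard, '$' anchors
    the end, and patterns match from the start of the URL path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything except '*', which becomes '.*'
    regex = "^" + ".*".join(re.escape(chunk) for chunk in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.search(regex, path) is not None

# Filter and add-to-cart URLs are blocked...
print(rule_matches("/*?add-to-cart=*", "/product/blue-shirt/?add-to-cart=42"))  # True
print(rule_matches("/*?filter_*", "/shop/?filter_color=blue"))                  # True
# ...while plain product and category pages stay crawlable
print(rule_matches("/*?orderby=*", "/shop/"))                                   # False
print(rule_matches("/cart/", "/product/cart-tote/"))                            # False
```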
| Mistake | Why It Is Harmful | Fix |
|---|---|---|
| Blocking /wp-content/ | Prevents Google from loading CSS, JS, and images — your pages render as broken | Remove the Disallow: /wp-content/ rule entirely |
| No Sitemap directive | Crawlers rely on the Sitemap line to discover content efficiently | Add Sitemap: https://yoursite.com/sitemap_index.xml |
| Blocking /feed/ | Prevents RSS syndication and content discovery | Remove unless you have a specific reason to block feeds |
| Blocking entire /wp-content/uploads/ | Your images disappear from Google Image Search | Never block uploads — your media lives here |
| Using Disallow to hide pages from search | Disallow prevents crawling but not indexing — pages can still appear in search results | Use noindex meta tag (via Yoast or Rank Math) to remove pages from search results |
| Not testing after changes | A single typo can block your entire site | Always visit yoursite.com/robots.txt after editing and check Google Search Console |
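The last mistake is worth automating a guard against. A quick sketch using the standard-library `urllib.robotparser`, checking for the catastrophic `Disallow: /` typo that blocks an entire site (`site_blocked` and `yoursite.com` are illustrative names, not real APIs):

```python
from urllib.robotparser import RobotFileParser

def site_blocked(robots_txt: str) -> bool:
    """Return True if the rules block the homepage for all crawlers —
    the classic 'Disallow: /' typo that can deindex an entire site."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("*", "https://yoursite.com/")

# A single stray character turns a safe file into a total block:
safe = "User-agent: *\nDisallow: /wp-admin/\n"
typo = "User-agent: *\nDisallow: /\n"   # blocks everything

print(site_blocked(safe))  # False
print(site_blocked(typo))  # True
```

To test the live file instead of a string, `RobotFileParser` can fetch it directly: `parser.set_url("https://yoursite.com/robots.txt"); parser.read()`.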
Generate the right robots.txt for your WordPress site — paste it in and go.
Open Robots.txt Generator