Free Robots.txt Generator — Block AI Crawlers & Set Crawl Rules

How to Use the Robots.txt Generator

  1. Choose your crawl access setting. "Allow All Crawlers" is the right choice for most live websites. Use "Custom Rules" to block specific directories, or "Block All Crawlers" if the site is in development and you don't want it crawled.
  2. Block AI training crawlers if you want to protect your content from being used to train AI models. Check any combination of GPTBot, CCBot, Google-Extended, anthropic-ai, PerplexityBot, and Bytespider.
  3. Add your sitemap URL so search engines can find your content efficiently. On WordPress, the built-in sitemap is at https://yoursite.com/wp-sitemap.xml, while SEO plugins typically use /sitemap_index.xml or /sitemap.xml.
  4. Copy or download the generated file, then upload it to your website's root directory (e.g., public_html/robots.txt).
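
As a rough illustration of what the steps above produce, here is a minimal Python sketch that assembles a robots.txt file from the same three choices. The function name build_robots_txt, its parameters, and the example paths are assumptions made for this sketch; it is not the tool's actual code, and the tool's exact output may differ.

```python
def build_robots_txt(mode="allow_all", blocked_bots=(), disallow_paths=(), sitemap_url=None):
    """Return robots.txt text for the chosen options (illustrative sketch only)."""
    lines = []

    # One group per blocked AI crawler, each disallowed from the whole site.
    for bot in blocked_bots:
        lines += [f"User-agent: {bot}", "Disallow: /", ""]

    # Default group for every other crawler.
    lines.append("User-agent: *")
    if mode == "allow_all":
        lines.append("Disallow:")  # an empty value blocks nothing
    elif mode == "block_all":
        lines.append("Disallow: /")
    elif mode == "custom":
        # Fall back to "allow everything" if no paths were given.
        lines += [f"Disallow: {path}" for path in disallow_paths] or ["Disallow:"]
    else:
        raise ValueError(f"unknown mode: {mode}")

    if sitemap_url:
        lines += ["", f"Sitemap: {sitemap_url}"]

    return "\n".join(lines) + "\n"


# Hypothetical choices: custom rules, two AI crawlers blocked, one sitemap.
robots_txt = build_robots_txt(
    mode="custom",
    blocked_bots=["GPTBot", "CCBot"],
    disallow_paths=["/wp-admin/", "/search/"],
    sitemap_url="https://yoursite.com/sitemap.xml",
)
print(robots_txt)
```

Each blocked crawler gets its own User-agent group, so a rule aimed at one bot never changes what the default group allows.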

What Is a Robots.txt File?

A robots.txt file is a plain text file placed in the root of your website that tells web crawlers which pages or directories they are and are not allowed to access. It follows the Robots Exclusion Protocol (REP), a standard that most legitimate search engine bots and web crawlers respect.

  • It is not a security measure — robots.txt is a polite request, not a lock. Malicious scrapers ignore it. Use proper authentication for truly private content.
  • It affects crawling, not indexing — blocking a page in robots.txt stops crawlers from visiting it, but if other sites link to it, Google may still index the URL without content.
  • It can improve crawl efficiency — telling Google not to crawl admin pages, internal search results, and staging areas helps your crawl budget focus on pages that matter.
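
To make the "polite request" idea concrete, the sketch below shows how a well-behaved crawler might consult the rules before fetching a page, using Python's standard urllib.robotparser module. The rules, the ExampleBot name, and the paths are assumptions chosen for illustration. Nothing here stops a client that never performs this check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules keeping crawlers out of admin and internal search pages.
rules = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /search/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def may_crawl(path, agent="ExampleBot"):
    """Ask the parsed rules whether this agent may fetch the given path."""
    return parser.can_fetch(agent, "https://yoursite.com" + path)


print(may_crawl("/blog/hello-world/"))     # True: no rule matches
print(may_crawl("/wp-admin/options.php"))  # False: under /wp-admin/
print(may_crawl("/search/?q=shoes"))       # False: under /search/
```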

Why Block AI Crawlers?

AI companies use web crawlers to collect content for training their language models. If you don't want your written content, product descriptions, or creative work used as AI training data, robots.txt is the standard mechanism to opt out. Key crawlers to know:

  • GPTBot: OpenAI's crawler, used to collect training data for the models behind ChatGPT. OpenAI's documentation states that GPTBot honors robots.txt rules.
  • Google-Extended: a robots.txt control token rather than a separate crawler. Google's existing crawlers fetch the pages, and this token tells Google whether that content may be used for Gemini and Vertex AI. It is distinct from Googlebot (which handles search indexing), so you can block Google-Extended without affecting your Google search rankings.
  • CCBot: run by Common Crawl, a non-profit whose web data is widely used in AI research and model training.
  • anthropic-ai: a user-agent token associated with Anthropic. Blocking it is intended to keep your content out of Claude model training. Anthropic's current documentation names ClaudeBot as its training crawler, so check which tokens are current before relying on this one alone.
  • PerplexityBot: used by Perplexity AI to crawl content for its AI search and answer engine.
  • Bytespider: the crawler of ByteDance (TikTok's parent company), used for AI model training.

Note: blocking these AI crawlers will not affect your rankings in Google, Bing, or other search engines, as those use different user-agent names (Googlebot, Bingbot).
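
The separation between AI tokens and search crawlers can be checked with a short sketch. It parses a hypothetical robots.txt that blocks two AI crawlers and then asks, via Python's standard urllib.robotparser, which user-agents may fetch a page. The rules and the URL are assumptions for illustration; real crawlers apply their own matching logic, which may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: two AI crawlers blocked, everyone else allowed.
rules = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

page = "https://yoursite.com/blog/hello-world/"
allowed = {
    agent: parser.can_fetch(agent, page)
    for agent in ("GPTBot", "CCBot", "Googlebot", "Bingbot")
}

for agent, ok in allowed.items():
    print(f"{agent}: {'allowed' if ok else 'blocked'}")
```

GPTBot and CCBot match their own groups and are blocked, while Googlebot and Bingbot fall through to the default group and remain allowed.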

Frequently Asked Questions

Where do I upload the robots.txt file?

Upload robots.txt to the root of your domain — the same folder where your homepage lives. For most WordPress sites, this is the public_html folder via FTP or your hosting file manager. The file must be accessible at https://yoursite.com/robots.txt.
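
Crawlers look for the file at one fixed location per host, which is why a copy in a subfolder is ignored. The sketch below derives that location from any page URL; the helper name robots_txt_url and the example URLs are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt location that governs the given page URL."""
    parts = urlsplit(page_url)
    # Keep scheme and host, discard the page's own path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://yoursite.com/blog/hello-world/?ref=home"))
print(robots_txt_url("https://shop.yoursite.com/cart"))
```

The second call shows that a subdomain has its own robots.txt, separate from the one on the main domain.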

Will blocking AI crawlers hurt my SEO?

No. Blocking AI training crawlers like GPTBot and Google-Extended does not affect Google Search (Googlebot) or Bing (Bingbot). These are separate user-agents. Your search engine rankings are not impacted.

Does robots.txt stop all crawlers?

It stops crawlers that follow the Robots Exclusion Protocol — which includes all major search engines and most legitimate bots. Malicious scrapers and less reputable crawlers may ignore it. Use server-level access controls or rate limiting for stronger protection.
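
One simple form of server-level control is refusing requests by User-Agent header. The sketch below does this as a small Python WSGI middleware. The token list, the function names, and the stand-in site are assumptions for illustration, and a scraper that sends a false User-Agent will still get through, so treat this as a complement to authentication and rate limiting rather than a replacement.

```python
# Hypothetical list of user-agent substrings to refuse (lowercase).
BLOCKED_AGENT_TOKENS = ("gptbot", "ccbot", "bytespider")


def block_listed_agents(app):
    """Wrap a WSGI app so that listed user-agents receive 403 Forbidden."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in agent for token in BLOCKED_AGENT_TOKENS):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everyone else is passed through to the real application.
        return app(environ, start_response)
    return middleware


def site(environ, start_response):
    """Stand-in for the real website."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


protected_site = block_listed_agents(site)
```

Unlike robots.txt, this check runs on the server, so it applies whether or not the client chooses to cooperate.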

What is the Sitemap directive in robots.txt?

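Adding the line Sitemap: https://yoursite.com/sitemap.xml to your robots.txt tells crawlers where to find your sitemap file, which makes it easier for them to discover your pages. The URL must be absolute, and the directive applies to all crawlers regardless of which User-agent group it sits next to.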
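
For a sense of how a crawler picks up this directive, the sketch below reads it back with Python's standard urllib.robotparser (the site_maps method requires Python 3.8 or later). The file contents are a hypothetical example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: everything allowed, one sitemap declared.
rules = """\
User-agent: *
Disallow:

Sitemap: https://yoursite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# site_maps() returns the declared sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```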

Is this tool free?

Yes, completely free with no account required. The file is generated locally in your browser — no data is sent to any server.
