
    Robots.txt Generator

    Generate a robots.txt file for your website. Control search engine crawling with an easy visual builder.

    No signup. 100% private. Processed in your browser.

    Configure your crawling rules below and copy or download the generated robots.txt file.

    Crawling Rules

    Generated robots.txt

    User-agent: *
    Allow: /
    Disallow: /admin
    Disallow: /api
    Disallow: /login
    
    Sitemap: https://example.com/sitemap.xml

    What Is robots.txt?

    The robots.txt file is a plain text file at your website's root that tells search engine crawlers which pages they're allowed to visit and which they should skip. It's the first file a well-behaved crawler checks before accessing any page on your site.

    Think of it as a "staff only" sign for web crawlers. It's a polite request, not a security measure: crawlers choose to respect it. Googlebot and Bingbot follow the rules reliably, while malicious bots and scrapers typically ignore it entirely. Never use robots.txt as your only defence for sensitive content.
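You can see how a well-behaved crawler interprets these rules using Python's standard-library urllib.robotparser. A minimal sketch, parsing Disallow rules like the ones in the generated file above (the URLs are illustrative):

```python
from urllib.robotparser import RobotFileParser

# Rules similar to the generated file shown above.
rules = """\
User-agent: *
Disallow: /admin
Disallow: /api
Disallow: /login
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)  # parse() takes the lines directly; no network fetch needed

print(parser.can_fetch("*", "https://example.com/blog/post"))    # allowed
print(parser.can_fetch("*", "https://example.com/admin/users"))  # blocked
```

Anything under a disallowed path prefix is reported as blocked; everything else is allowed by default.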

    Directive Reference

    Directive   | Syntax                          | What It Does
    User-agent  | User-agent: *                   | Specifies which crawler the rules apply to (* = all)
    Allow       | Allow: /public/                 | Permits crawling of a specific path
    Disallow    | Disallow: /admin/               | Blocks crawling of a specific path
    Sitemap     | Sitemap: https://…/sitemap.xml  | Points crawlers to your XML sitemap
    Crawl-delay | Crawl-delay: 10                 | Requests N seconds between requests (not respected by Google)

    What this means for you: Google ignores Crawl-delay; use Search Console's crawl rate settings instead. Bing and Yandex do respect it. After Allow/Disallow, Sitemap is the most important directive.
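For crawlers that do honour Crawl-delay, the value is machine-readable; Python's urllib.robotparser exposes it via crawl_delay(). A small sketch with a made-up ruleset:

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Crawl-delay: 10
Disallow: /private
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

print(parser.crawl_delay("*"))  # seconds requested between fetches: 10
print(parser.can_fetch("*", "https://example.com/private/report"))  # blocked
```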

    Common robots.txt Patterns

    Scenario             | robots.txt                          | Notes
    Allow everything     | User-agent: *  Allow: /             | Default for most sites; let crawlers see everything
    Block everything     | User-agent: *  Disallow: /          | Staging/dev sites only; never do this in production
    Block admin paths    | Disallow: /admin/  Disallow: /api/  | Standard security hygiene; don't expose backend routes
    Block a specific bot | User-agent: AhrefsBot  Disallow: /  | Blocks aggressive SEO crawlers that waste bandwidth
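The last pattern is worth spelling out in full: under the robots exclusion standard, a crawler follows only the most specific User-agent group that matches it, so a bot-specific block coexists with a permissive default. A sketch:

```text
# AhrefsBot matches its own group and is fully blocked
User-agent: AhrefsBot
Disallow: /

# Every other crawler falls through to the wildcard group
User-agent: *
Disallow:
```

An empty Disallow value means "nothing is disallowed", i.e. allow everything.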

    Common Mistakes

    Blocking CSS and JS Files

    Google needs to render your pages to understand them. Blocking CSS/JS files in robots.txt prevents rendering and can hurt your rankings. Only block truly private resources.
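If a directory you block also holds stylesheets or scripts, more specific Allow rules can carve the assets back out, since major crawlers give precedence to the longest matching rule. A sketch with hypothetical paths:

```text
User-agent: Googlebot
Disallow: /private/
# Longer, more specific rules win, so render-critical assets stay crawlable
Allow: /private/*.css
Allow: /private/*.js
```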

    Using robots.txt for Security

    Disallow doesn't hide pages — it just asks crawlers not to visit them. The URLs are still visible in the file. Use authentication and proper access controls for sensitive content.

    Forgetting to Update After Redesign

    Site redesigns often change URL structures. If your old robots.txt blocks paths that are now important, those pages won't get crawled. Review robots.txt after every major change.

    Confusing Disallow with Noindex

    Disallow prevents crawling. Noindex prevents indexing. A page blocked by robots.txt can still appear in search results if other sites link to it. Use noindex meta tags to prevent indexing.
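For noindex to work, the page must stay crawlable so the crawler can actually see the signal. It can be sent as a meta tag in the page's head, for example:

```html
<!-- Allow crawling, forbid indexing -->
<meta name="robots" content="noindex">
```

For non-HTML resources such as PDFs, the equivalent is the X-Robots-Tag: noindex HTTP response header.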

    AI Crawlers You Should Know About

    Bot Name        | Company      | User-Agent      | What It Does
    GPTBot          | OpenAI       | GPTBot          | Crawls content for training ChatGPT models
    Google-Extended | Google       | Google-Extended | Training data for Gemini/Bard AI models
    CCBot           | Common Crawl | CCBot           | Open dataset used by many AI companies
    anthropic-ai    | Anthropic    | anthropic-ai    | Crawls for Claude model training
    ClaudeBot       | Anthropic    | ClaudeBot       | Web browsing for Claude responses

    To block all AI training crawlers, add a separate User-agent group with Disallow: / for each bot. Blocking search engine crawlers (Googlebot, Bingbot) is a separate decision — those affect your search rankings, not AI training. You can block AI training while keeping your search presence.
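Putting the table above into practice, a robots.txt that opts out of AI training while leaving Googlebot and Bingbot untouched looks like this, one group per bot:

```text
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: ClaudeBot
Disallow: /
```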


    How to use this tool

    1. Enter your sitemap URL and toggle common blocking rules (admin, API, login)

    2. Add custom disallow paths for any additional routes to block

    3. Copy or download the generated robots.txt file and upload it to your site root
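The file the tool builds can also be produced by a few lines of script. A minimal Python sketch (build_robots_txt is a hypothetical helper, not part of this tool):

```python
def build_robots_txt(disallow_paths, sitemap_url=None, user_agent="*"):
    """Assemble a robots.txt body from a list of paths to block."""
    lines = [f"User-agent: {user_agent}", "Allow: /"]
    lines += [f"Disallow: {path}" for path in disallow_paths]
    if sitemap_url:
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"

robots = build_robots_txt(
    ["/admin", "/api", "/login"],
    sitemap_url="https://example.com/sitemap.xml",
)
print(robots)
```

Save the result as robots.txt at the root of your site (it must be reachable at /robots.txt, not in a subdirectory).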

    Common uses

    • Blocking admin and login pages from search engine crawling
    • Preventing API endpoints from appearing in search results
    • Setting up robots.txt for new website deployments
    • Adding sitemap references for improved search engine discovery


    Frequently Asked Questions