FREE TOOL

Robots.txt Checker

Fetch and analyze any website's robots.txt file. See user-agent rules, disallow patterns, sitemap references, and crawl directives.

What is robots.txt?

Robots.txt is a plain text file at the root of a website that tells search engine crawlers which pages or sections they should or should not visit. It follows the Robots Exclusion Protocol (standardized as RFC 9309), a convention that all major search engines respect.

Why does robots.txt matter for security?

While robots.txt is primarily an SEO tool, it has security implications. Disallow rules can inadvertently reveal the existence of sensitive paths (admin panels, internal APIs, staging environments). Attackers often check robots.txt first to find hidden endpoints. Never rely on robots.txt as a security measure - use proper authentication and access controls instead.
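As a sketch of why Disallow rules leak information: the file is world-readable, so a few lines of Python can list every path a site asks crawlers to avoid. The helper name below is hypothetical, and the sample rules are invented.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Collect every path listed in a Disallow directive (hypothetical helper)."""
    paths = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if line.lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path:  # an empty Disallow value means "allow everything"
                paths.append(path)
    return paths

# Fetching the file itself is a single request, e.g.:
#   from urllib.request import urlopen
#   robots = urlopen("https://example.com/robots.txt").read().decode()

print(disallowed_paths("User-agent: *\nDisallow: /admin/\nDisallow: /staging/\n"))
```

Anyone (or any bot) can run the equivalent, which is exactly why listing a secret admin path in robots.txt advertises it rather than hiding it.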

What are common robots.txt directives?

User-agent specifies which crawler a rule applies to (* means all). Disallow blocks a path from being crawled. Allow overrides a Disallow for a specific sub-path. Sitemap tells crawlers where your XML sitemap lives. Crawl-delay asks crawlers to wait between requests (advisory only; some major crawlers, including Googlebot, ignore it).
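Put together, a file using each of those directives might look like this (the paths and URLs are illustrative, not a recommendation for any real site):

```
User-agent: *          # applies to every crawler
Disallow: /search/     # don't crawl internal search results
Allow: /search/help    # ...except this sub-path

User-agent: Bingbot    # a separate block for one specific crawler
Crawl-delay: 10        # ask for 10 seconds between requests

Sitemap: https://example.com/sitemap.xml
```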

Best practices

Always include a Sitemap directive pointing to your XML sitemap. Do not list sensitive paths in Disallow rules (this reveals them). Use specific user-agent blocks if you need different rules for different bots. Keep the file concise - overly complex robots.txt files can cause parsing issues.
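You can sanity-check rules like these programmatically with Python's standard-library `urllib.robotparser`. The rules below are invented for illustration; for a live site you would use `parser.set_url(...)` and `parser.read()` instead of parsing a string.

```python
from urllib.robotparser import RobotFileParser

# Invented example rules; Allow is listed before Disallow so the override
# is unambiguous regardless of a parser's matching order.
rules = """\
User-agent: *
Allow: /admin/help
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

print(parser.can_fetch("*", "https://example.com/admin/"))      # blocked by Disallow
print(parser.can_fetch("*", "https://example.com/admin/help"))  # Allow overrides
print(parser.can_fetch("*", "https://example.com/products"))    # no rule -> allowed
```

This is also a quick way to verify that a rule change does what you expect before deploying it.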

FAQ

Frequently asked questions

Is this robots.txt checker free?

Yes, completely free and instant. Analyze any public website's robots.txt - no signup required.

Does robots.txt block hackers?

No. Robots.txt is a voluntary standard - legitimate crawlers respect it, but malicious bots ignore it entirely. Never use robots.txt as a security mechanism. Use authentication, firewalls, and access controls to protect sensitive resources.

What if my site has no robots.txt?

Without a robots.txt file, search engines will crawl all accessible pages on your site. This is usually fine for small sites, but larger sites benefit from a robots.txt to manage crawl budget and prevent indexing of duplicate or low-value pages.

What does Disallow: / mean?

Disallow: / tells crawlers not to access any page on your site. This effectively removes your entire site from search engine results. Use this only if you genuinely want to prevent all indexing - for example, on staging or development environments.
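For instance, a staging environment that should stay out of search results entirely would ship just:

```
User-agent: *
Disallow: /
```

Note that a disallowed page can still appear in results as a bare URL if other sites link to it; to keep a staging environment truly private, pair this with authentication rather than relying on robots.txt alone.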

FULL SECURITY AUDIT

Robots.txt Checker is just the start.

CQwerty Shield checks SSL, DMARC, SPF, DNS, HTTP headers, WHOIS, breach intel, and more — with CVE/KEV cross-references on every finding.

Free full scan — no signup