Robots.txt
What is robots.txt?
Robots.txt is a plain-text file placed in the root directory of a website (/robots.txt) that tells search engine crawlers which pages they may crawl and which they should skip. It is the first thing Googlebot (and other crawlers, including AI bots) requests before crawling a site.
The robots.txt file does NOT block indexing — a page can still appear in Google results without being crawled (e.g., if other pages link to it). To block indexing, use the noindex meta tag instead.
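The noindex directive mentioned above goes into the page's HTML head (or into an X-Robots-Tag HTTP header). A minimal fragment:

```html
<!-- Blocks indexing of this page. The page must stay crawlable
     (not blocked in robots.txt), or Googlebot never sees this tag. -->
<meta name="robots" content="noindex">
```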
Why does it matter?
- Crawl budget control — blocking unimportant pages (admin, duplicates) conserves crawling resources
- Resource protection — blocking admin pages, dev versions, internal search pages
- AI crawlers — controlling access for GPTBot, ClaudeBot, and PerplexityBot to your content
- Sitemap reference — robots.txt is the standard place to include the sitemap URL
How does it work?
User-agent: *
Allow: /
Disallow: /_next/
Disallow: /api/

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml
Key directives:
- User-agent — which crawler the rules apply to (* = all crawlers)
- Allow — explicit permission to crawl a path
- Disallow — blocks crawling of a path
- Sitemap — the URL of the sitemap file
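The directives above can be checked programmatically. A minimal sketch using Python's standard-library urllib.robotparser (the domain is a placeholder; the blanket Allow: / is omitted because urllib.robotparser applies rules in file order, and crawling is allowed by default anyway):

```python
from urllib import robotparser

# Hypothetical robots.txt content, based on the example above
rules = """\
User-agent: *
Disallow: /_next/
Disallow: /api/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# A regular page is crawlable; a Disallow'ed path is not
print(rp.can_fetch("*", "https://yourdomain.com/blog/post"))  # True
print(rp.can_fetch("*", "https://yourdomain.com/api/users"))  # False
```

In production you would point the parser at the live file with `rp.set_url("https://yourdomain.com/robots.txt")` followed by `rp.read()`.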
Best practices
- Do not block important resources — CSS, JS, and images must be accessible for page rendering
- Allow AI crawlers — for GEO, it is important that GPTBot, ClaudeBot, and PerplexityBot can scan your content
- Block admin and duplicates — /admin/, /api/, internal search pages
- Add a sitemap — Sitemap: https://... at the end of the file
- Test in GSC — Google Search Console has a tool for testing robots.txt
- Do not use it to hide content — robots.txt does not protect private data
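One subtlety behind the "Allow AI crawlers" advice: a crawler that matches its own User-agent group ignores the generic * group entirely, so a dedicated GPTBot group with Allow: / lifts the * restrictions for that bot. A sketch of this behavior with urllib.robotparser (placeholder domain and rules):

```python
from urllib import robotparser

# Hypothetical file: /api/ is blocked for everyone,
# but GPTBot gets its own permissive group
rules = """\
User-agent: *
Disallow: /api/

User-agent: GPTBot
Allow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# GPTBot matches its own group, so the * rules do not apply to it
print(rp.can_fetch("GPTBot", "https://yourdomain.com/api/data"))     # True
print(rp.can_fetch("Googlebot", "https://yourdomain.com/api/data"))  # False
```

If you want the * restrictions to also bind a bot you list separately, repeat the Disallow lines inside that bot's group.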
More on technical optimization in the article on technical SEO.
Related terms
- Sitemap — site map for search engines
- Crawlability — the ability of a site to be crawled
- Crawl budget — the number of URLs a search engine will crawl on a site
- GEO — optimization for AI
- Indexing — the process of adding pages to Google's index