Crawlability
What is crawlability?
Crawlability is a website's ability to be found and crawled by search engine bots (crawlers, e.g. Googlebot). If a crawler cannot reach a page or its subpages, those pages will not be indexed — regardless of content quality or backlinks.
Crawlability is the foundation of technical SEO — if Google cannot scan a page, no other SEO efforts will produce results.
Why does it matter?
- Prerequisite — a page must be crawlable before it can be indexed and displayed in search results (SERPs)
- Crawl budget — Google allocates a limited crawling budget to each site
- New content — good crawlability speeds up the discovery and indexing of new pages
- Problem diagnosis — crawling blocks are a common cause of lack of visibility in Google
How does it work?
Googlebot discovers pages in two ways:
- Through links — by following internal and external links
- Through the sitemap — by reading the sitemap.xml file containing a list of all important URLs
Before fetching pages, Googlebot checks the robots.txt file, which specifies which parts of the site it may scan. If a page is blocked in robots.txt or returns an error (e.g. 404 or 500), it will not be indexed.
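This robots.txt check can be simulated with Python's standard library. The file content, user agent, and URLs below are illustrative placeholders, not taken from a real site:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content (a real crawler fetches this from /robots.txt)
robots_txt = """\
User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A crawler identifying as Googlebot falls under "User-agent: *" here
print(parser.can_fetch("Googlebot", "https://example.com/blog/post"))   # True
print(parser.can_fetch("Googlebot", "https://example.com/admin/login")) # False
```

The same kind of check can be run against a live file with Google Search Console's robots.txt report.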
Best practices
- Correct robots.txt — do not accidentally block important pages
- Up-to-date XML sitemap — submit it in Google Search Console
- Logical internal link structure — every important page should be reachable in at most three clicks from the homepage
- No redirect chains — each redirect should point directly to the final destination
- Fast server response time — a slow server wastes crawl budget
- Fix 404 and 500 errors — monitor them in Google Search Console
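A minimal setup reflecting these practices might look like the following. The blocked paths, domain, and dates are placeholders — adjust them to your own site:

```text
# robots.txt — block only non-public sections, point crawlers at the sitemap
User-agent: *
Disallow: /admin/
Disallow: /cart/

Sitemap: https://example.com/sitemap.xml
```

A matching sitemap.xml entry, following the sitemaps.org protocol:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/blog/post</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
</urlset>
```

Keeping robots.txt short and permissive by default makes it harder to accidentally block important pages.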
More on diagnostics in our guide to speeding up indexing.
Related terms
- Indexing — the process of adding a page to Google's index
- Crawl budget — the crawling limit allocated by Google
- Robots.txt — a file controlling crawler access
- Sitemap — a site map for search engines
- SEO — search engine optimization