Keyword clustering — grouping keywords for SEO
What is keyword clustering?
Keyword clustering is the process of grouping keywords into clusters with a shared search intent (SERP intent), so that each cluster targets exactly one destination page. Goal: maximize the semantic reach of a page without cannibalization and without building separate URLs for synonyms.
Instead of writing 50 separate pages for 50 queries, clusters let you optimize 10-15 pages, each ranking for 10-30 related queries.
Why does it matter?
- Elimination of cannibalization — when 2 pages target the same query, they compete with each other and both tend to rank worse than one consolidated page would. Clusters prevent this
- Higher content efficiency — one page = many keywords = more clicks
- Better topical mapping — clusters are the foundation of topical mapping
- Scalability — instead of 500 micro-pages, you maintain 50 strong pillar pages
- Better structure planning — clusters show how many URLs you actually need
How does SERP-based clustering work?
The most reliable method relies on real Google SERPs rather than on embedding similarity (e.g. word2vec):
- Fetch top 10 URLs for each query from the list
- Compare overlap — if 2 queries have ≥3 common URLs in top 10, they belong to the same cluster
- Build a graph — queries are vertices, and an edge connects two queries whose top 10 results meet the overlap threshold; clusters are connected components
- Pick the pilot keyword — query with highest search volume in the cluster
Why SERP overlap is best: Google's rankings already encode which queries share an intent. When two queries return largely the same top 10, one page can serve both. When the URLs differ, the intents differ and require separate pages.
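The overlap rule can be illustrated with a minimal sketch. The function names and the example URLs below are hypothetical, and the SERPs are shortened to a few results for readability; only the threshold of 3 shared URLs comes from the text.

```python
def serp_overlap(urls_a, urls_b):
    """Count URLs that appear in both top-10 lists."""
    return len(set(urls_a[:10]) & set(urls_b[:10]))


def same_cluster(urls_a, urls_b, threshold=3):
    """Two queries belong together when they share at least `threshold` URLs."""
    return serp_overlap(urls_a, urls_b) >= threshold


# Hypothetical SERPs (made-up URLs, shortened lists)
serp_crm_software = ["a.example/crm", "b.example/best-crm", "c.example/crm-tools", "d.example/crm"]
serp_crm_tools = ["b.example/best-crm", "c.example/crm-tools", "d.example/crm", "e.example/list"]
serp_what_is_crm = ["w.example/what-is-crm", "a.example/crm", "x.example/crm-definition"]

print(same_cluster(serp_crm_software, serp_crm_tools))    # 3 shared URLs -> True
print(same_cluster(serp_crm_software, serp_what_is_crm))  # 1 shared URL  -> False
```

In practice URLs would likely need normalizing first (protocol, trailing slash, tracking parameters) so that the same page is not counted as two different results.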
Types of search intent
Each cluster has one of 4 base intents:
1. Informational
- "what is [X]", "how does [X] work", "[X] vs [Y]"
- Clusters → blog, glossary, guides
- Conversion: low, but builds authority + remarketing
2. Navigational
- "[brand] login", "[product] pricing"
- Clusters → brand pages
- Conversion: medium, user already knows the brand
3. Commercial Investigation
- "best [X]", "[X] review", "[X] alternatives", "[X] for small companies"
- Clusters → comparison pages, listings, lists
- Conversion: high, user close to decision
4. Transactional
- "buy [X]", "[X] price", "order [X]"
- Clusters → product pages, landing pages
- Conversion: highest
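As a rough pre-filter, the query modifiers listed above can be turned into pattern rules. This is only a heuristic sketch: the rule order, the regular expressions and the name `guess_intent` are assumptions made for illustration, and the SERP itself remains the authoritative intent signal.

```python
import re

# Ordered rules: the first matching pattern wins. The patterns mirror the
# example modifiers from the four intent types and are not exhaustive.
INTENT_RULES = [
    ("transactional", re.compile(r"\b(buy|order|price)\b")),
    ("commercial", re.compile(r"\b(best|reviews?|alternatives)\b|\bfor small companies\b")),
    ("navigational", re.compile(r"\b(login|pricing)\b")),
    ("informational", re.compile(r"^(what is|how does)\b|\bvs\b")),
]


def guess_intent(query, default="unknown"):
    """Return the first intent whose pattern matches the query, else `default`."""
    q = query.lower().strip()
    for intent, pattern in INTENT_RULES:
        if pattern.search(q):
            return intent
    return default


for q in ["buy standing desk", "best crm for small companies", "acme login", "what is keyword clustering", "crm"]:
    print(q, "->", guess_intent(q))
```

Queries without a modifier (such as a bare "crm") fall through to `unknown`, which is where checking the actual SERP matters most.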
How to do clustering — step by step
Step 1: collect the query list
- From Google Search Console (already ranking)
- From Ahrefs/Semrush (potential)
- From Google Autocomplete + AlsoAsked
- From competition (Ahrefs site explorer)
- From keyword research
Typically 500-3000 queries at the start.
Step 2: collect SERPs
- For each query → SerpAPI/DataForSEO API → top 10 URLs
- Cost: ~5 USD per 1000 queries
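Because every SERP request costs money, it helps to cache results and only fetch queries that have not been seen yet. The sketch below deliberately does not call any real API: `fetch_top_urls` is a hypothetical callable you would implement against your SERP provider, and `collect_serps`, `estimated_cost_usd` and the cache file name are illustrative assumptions. The default price is the approximate figure quoted above.

```python
import json
from pathlib import Path


def collect_serps(queries, fetch_top_urls, cache_path="serps.json", top_n=10):
    """Return {query: [url, ...]}, calling the fetcher only for uncached queries.

    `fetch_top_urls(query)` is a placeholder for your own provider client;
    it must return an iterable of result URLs in ranking order.
    """
    path = Path(cache_path)
    cache = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    for query in queries:
        if query not in cache:
            cache[query] = list(fetch_top_urls(query))[:top_n]
    path.write_text(json.dumps(cache, indent=2), encoding="utf-8")
    return {q: cache[q] for q in queries}


def estimated_cost_usd(n_queries, usd_per_1000=5.0):
    """Rough budget estimate; the per-1000 price is an approximation."""
    return n_queries / 1000 * usd_per_1000


print(estimated_cost_usd(3000))  # upper end of a typical starting list
```

Keeping the raw top 10 lists on disk also makes it possible to re-run clustering with a different threshold without paying for the SERPs again.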
Step 3: build overlap matrix
- 1000 queries × 1000 queries = 1M matrix cells, but the matrix is symmetric, so only 1000 × 999 / 2 = 499,500 unique pairs need comparing
- For each pair: number of common URLs in top 10
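Comparing every pair directly works for a few thousand queries, but most pairs share no URL at all. One possible shortcut, sketched below, is an inverted index from URL to the queries that rank it, so that only pairs with at least one shared URL are ever counted. The function name `overlap_counts` and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def overlap_counts(serps):
    """serps: {query: [url, ...]} -> {(query_a, query_b): shared URL count}.

    Only pairs sharing at least one URL appear in the result; keys are
    ordered alphabetically so each pair is stored once.
    """
    queries_by_url = defaultdict(set)
    for query, urls in serps.items():
        for url in set(urls[:10]):
            queries_by_url[url].add(query)

    counts = defaultdict(int)
    for queries in queries_by_url.values():
        for a, b in combinations(sorted(queries), 2):
            counts[(a, b)] += 1
    return dict(counts)


# Hypothetical SERPs with placeholder URL ids
serps = {
    "crm software": ["u1", "u2", "u3", "u4"],
    "crm tools": ["u2", "u3", "u4", "u5"],
    "what is crm": ["u1", "u9"],
}
print(overlap_counts(serps))
# {('crm software', 'crm tools'): 3, ('crm software', 'what is crm'): 1}  (key order may vary)
```

The result is a sparse version of the overlap matrix: any pair missing from the dictionary has zero common URLs.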
Step 4: apply threshold
- ≥3 common URLs → same cluster
- Connected components in graph = final clusters
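Connected components can be found with a small union-find structure. The sketch below assumes pair counts are already available as a dictionary keyed by query pair; `cluster_queries` and the sample numbers are illustrative, not taken from any particular tool.

```python
def cluster_queries(queries, pair_counts, threshold=3):
    """Group queries into connected components of the overlap graph.

    pair_counts: {(query_a, query_b): number of shared top-10 URLs}
    """
    parent = {q: q for q in queries}

    def find(q):
        while parent[q] != q:
            parent[q] = parent[parent[q]]  # path halving keeps trees shallow
            q = parent[q]
        return q

    for (a, b), shared in pair_counts.items():
        if shared >= threshold:
            parent[find(a)] = find(b)  # merge the two components

    clusters = {}
    for q in queries:
        clusters.setdefault(find(q), []).append(q)
    # Largest clusters first, members sorted for stable output
    return sorted((sorted(m) for m in clusters.values()), key=lambda c: (-len(c), c))


queries = ["crm software", "crm tools", "best crm", "what is crm"]
pair_counts = {
    ("crm software", "crm tools"): 4,
    ("crm tools", "best crm"): 3,
    ("crm software", "what is crm"): 1,
}
clusters = cluster_queries(queries, pair_counts)
print(clusters)
# [['best crm', 'crm software', 'crm tools'], ['what is crm']]
```

Note the chaining effect: "best crm" and "crm software" end up together through "crm tools" even though that pair has no recorded overlap of its own. This is one reason the manual validation in the next step matters.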
Step 5: manual validation
- Check the 10 largest clusters — is the macro-intent really uniform?
- Split or drop clusters with mixed intent
Step 6: assign URLs
- One cluster → one existing page (update) or new (write)
- Pilot keyword in title/H1, others in H2/H3/body
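Picking the pilot keyword and laying out the remaining queries can be sketched as a simple sort by search volume. The function `plan_page`, the output keys and the volumes below are hypothetical; real volumes would come from your keyword tool.

```python
def plan_page(cluster, volumes):
    """Pick the highest-volume query as pilot; the rest support it.

    cluster: list of queries in one cluster
    volumes: {query: monthly search volume}; missing queries count as 0
    """
    ranked = sorted(cluster, key=lambda q: (-volumes.get(q, 0), q))
    pilot, *supporting = ranked
    return {"pilot": pilot, "title_h1": pilot, "h2_h3_body": supporting}


# Made-up volumes for illustration only
cluster = ["crm tools", "crm software", "best crm"]
volumes = {"crm software": 5400, "crm tools": 1900, "best crm": 2900}
plan = plan_page(cluster, volumes)
print(plan)
# {'pilot': 'crm software', 'title_h1': 'crm software', 'h2_h3_body': ['best crm', 'crm tools']}
```

Ties in volume are broken alphabetically here only to keep the output deterministic; in practice you might prefer the query with the better current ranking.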
Tools
- MarketingOS — CLI has cannibalization audit + striking distance per cluster
- Keyword Insights — best dedicated tool ($)
- Ahrefs Parent Topic — automatic clusters with SERP overlap
- Surfer Keyword Clusterer — built into Surfer
- Custom Python + DataForSEO API — cheapest, most flexible
Common mistakes
- Clustering by semantic similarity — "SEO" and "search engine optimization" are semantically similar but may have different SERPs
- Clusters with mixed intent — "CRM" (informational) and "CRM for 50-person companies" (commercial) → separate clusters
- Threshold too low — ≥1 common URL produces clusters that are too loose; ≥3 is the sweet spot
- Ignoring SERP features — featured snippet, video carousel, "people also ask" change intent
- No refresh — SERPs shift over time; re-check and re-run clustering every 3-6 months
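For the refresh problem, one way to decide which queries need re-clustering is to compare the stored top 10 with a fresh one. The sketch below uses Jaccard distance between the two URL sets; the 0.5 cutoff and the names `serp_drift` and `queries_to_recluster` are assumptions for illustration, not established values.

```python
def serp_drift(old_urls, new_urls):
    """1 - Jaccard similarity of two top-10 URL sets (0.0 identical, 1.0 disjoint)."""
    old, new = set(old_urls[:10]), set(new_urls[:10])
    if not old and not new:
        return 0.0
    return 1 - len(old & new) / len(old | new)


def queries_to_recluster(old_serps, new_serps, max_drift=0.5):
    """Return queries whose SERP changed by more than `max_drift`."""
    return sorted(
        q for q in old_serps
        if q in new_serps and serp_drift(old_serps[q], new_serps[q]) > max_drift
    )


# Hypothetical snapshots with placeholder URL ids
old_serps = {"crm software": ["u1", "u2", "u3", "u4"], "crm tools": ["u1", "u2", "u3", "u4"]}
new_serps = {"crm software": ["u1", "u2", "u3", "u5"], "crm tools": ["u7", "u8", "u9", "u1"]}
print(queries_to_recluster(old_serps, new_serps))  # ['crm tools']
```

Re-fetching only a sample of queries per cluster keeps the refresh cost low; a full re-run is needed only when many of them drift.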
Keyword clustering and B2B SEO
In B2B, clusters are usually smaller (5-15 queries vs 50+ in e-commerce), but yield higher ROI:
- Each query has low volume but high intent
- The cluster ties them into 1 pillar page of 1500-3000 words
- The page ranks for the entire long-tail of intent simultaneously
- For ABM — clusters per industry/use case
Related terms
- Keyword research — input to clustering
- SERP — source of overlap signal
- Topical mapping — higher structure level
- Topical authority — effect of good clustering
- Long-tail keywords — make up most of the queries in a cluster
- Cannibalization — what you eliminate through clustering