What is information gain?

Information gain is a Google ranking signal described in patent US9317592B1 from 2014, titled "Content-based ranking signals." It measures how much new, unique information your content brings relative to pages already ranking for the same query.

In practice: if 10 pages in the SERP say the same thing, an 11th page with identical content gives Google no reason to rank it. A page that adds a new angle, data, experiments, or a fresh perspective scores higher on information gain and stands a chance of outranking competitors.

Why does it matter?

Helpful Content algorithm (2022) and Spam Updates (2024): both rely heavily on this signal to filter out duplicated content
AI Overviews and GEO: language models cite sources that bring unique data, not those that paraphrase
Long-tail visibility: content with high gain dominates less competitive (long-tail) queries
Defensive SEO: without gain, even a well-optimized page loses positions to fresher competition

How does Google measure information gain?

The patent describes a 3-stage process:

Top-N corpus: Google fetches the top 10-20 pages for the query
Entity and claim extraction: it extracts entities (people, places, concepts) and claims from each page
Comparative scoring: it compares your page with the aggregated corpus; the more new entities and claims, the higher the gain score
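
The comparative scoring stage can be pictured with a toy calculation. The sketch below is an assumption made for illustration, not the formula from the patent: it treats every page as a hand-supplied set of entities and claims (extraction is assumed to happen upstream) and reports the share of the draft's items that no corpus page contains. The function names are hypothetical.

```python
def normalize(items):
    """Lower-case and trim items so that 'Google' and 'google ' match."""
    return {item.strip().lower() for item in items}


def information_gain_score(candidate_items, corpus_pages):
    """Share of the candidate's items that appear on no corpus page.

    Toy ratio for illustration only; the real scoring is not public.
    """
    candidate = normalize(candidate_items)
    if not candidate:
        return 0.0
    seen = set()
    for page in corpus_pages:
        seen |= normalize(page)
    novel = candidate - seen
    return len(novel) / len(candidate)


# Hypothetical entity sets for two ranking pages and one draft.
corpus = [
    {"Google", "SERP", "E-E-A-T"},
    {"Google", "SERP", "long-tail"},
]
draft = {"Google", "SERP", "A/B test", "Search Console"}

print(information_gain_score(draft, corpus))  # 0.5
```

Two of the four draft items ("A/B test" and "Search Console") are absent from the corpus, so the toy score is 0.5. A ratio is used here rather than a raw count so that long pages are not rewarded merely for listing more items.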

In MarketingOS we implement this signal in the infogain audit module, which builds a SERP coverage matrix and points out gain points, that is, specific fragments to add.

How to increase information gain?

1. Audit the SERP before writing

Check the top 10 pages for your query
Extract their H2/H3 headings (topic structure) and dominant entities
Identify gaps: topics covered by only 1-2 pages and missed by the rest
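
The audit steps above can be sketched as a small coverage matrix. This is a minimal illustration under stated assumptions: the headings are already scraped and passed in by hand, topics are matched by exact normalized heading text (a real audit would need fuzzier matching), and the function names and example URLs are hypothetical.

```python
from collections import defaultdict


def coverage_matrix(pages):
    """Map each normalized heading to the set of URLs that use it."""
    matrix = defaultdict(set)
    for url, headings in pages.items():
        for heading in headings:
            matrix[heading.strip().lower()].add(url)
    return dict(matrix)


def find_gaps(matrix, max_pages=2):
    """Return (topic, page_count) pairs covered by at most max_pages pages."""
    gaps = [
        (topic, len(urls))
        for topic, urls in matrix.items()
        if len(urls) <= max_pages
    ]
    # Rarest topics first, alphabetical within the same count.
    return sorted(gaps, key=lambda item: (item[1], item[0]))


# Hypothetical H2/H3 headings collected from four ranking pages.
pages = {
    "a.example": ["What is information gain", "How to measure it"],
    "b.example": ["What is information gain", "Common mistakes"],
    "c.example": ["What is information gain", "How to measure it", "Case study"],
    "d.example": ["What is information gain", "How to measure it"],
}

print(find_gaps(coverage_matrix(pages)))
# [('case study', 1), ('common mistakes', 1)]
```

Topics that almost every page covers are table stakes; the rarely covered ones returned by `find_gaps` are the candidates for sections that add gain.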

2. Bring unique data

Internal metrics: your own case studies, A/B tests, Google Search Console data
Fresh benchmarks: comparisons of multiple tools or strategies run on your own setup
Experiments: a description of a test and conclusions nobody has published yet
Insider knowledge: market context, product decisions, expert perspectives

3. Add entities missing from the corpus

Enrich the text with named entities (people, companies, tools, standards, dates)
Each entity is a potential topical signal: link it to its entry in a knowledge graph
Target: 15+ unique entities per page (Google Cortez 2026 threshold)

4. Atomic answers

Treat the content as a set of atomic answers (entity, image, and content atoms)
Each fragment must carry one unique piece of information that an AI system can extract
Helpful formats: FAQ, definitions, statistics, step-by-step lists

Common mistakes

Paraphrasing top-rankers: AI detectors and Google recognize it (zero gain)
Theory only, no practice: the corpus already contains the theory; add "how we did it"
No numbers: raw data (percentages, KPIs, timings) is a strong gain signal
Copy-pasted FAQ from competitors: Google groups duplicates and picks the original
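
A rough pre-publication check can catch the copy-paste mistake before it ships. The sketch below compares two texts by the Jaccard similarity of their word trigrams (shingles). It is a generic near-duplicate heuristic chosen for illustration, not a description of how Google groups duplicates, and it will not catch a thorough paraphrase. The function names and sample texts are hypothetical.

```python
import re


def shingles(text, size=3):
    """Set of overlapping word n-grams from the lower-cased text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b, size=3):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


competitor = "Information gain measures how much new information a page adds"
copied = "Information gain measures how much new information a page adds"
original = "We ran an A/B test on 40 landing pages last quarter"

print(jaccard(competitor, copied))    # 1.0
print(jaccard(competitor, original))  # 0.0
```

A score close to 1.0 against any competitor paragraph suggests the fragment adds nothing new and should be rewritten around your own data.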

Information gain and GEO

In the context of GEO (Generative Engine Optimization), information gain matters even more: LLM-based assistants (ChatGPT, Perplexity, Gemini) pick unique sources to cite, because duplicated content gets suppressed in the response deduplication phase.

Pages with high gain earn:

AI citations in generative responses
Brand mentions in AI narratives
Zero-click visibility: exposure even when the user never clicks through

Related terms

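Topical authority: a site's authority on a given theme
Topical mapping: mapping content into clusters
E-E-A-T: experience, expertise, authoritativeness, and trust signals
GEO: optimization for generative AI engines
Helpful content: content written for people first, not for search engines
SERP: search engine results page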
