How the Search Stack Evolved from Blue Links to LLM Answers

Last updated: 10/8/2025 * Kurt Fischman

Search engines began as ranking machines for documents. They crawled pages, built an index, and sorted results by a mix of text relevance and link authority. Google’s big unlock was PageRank, which treated links like votes and used graph structure to estimate importance. This turned a chaotic web into a hierarchy of answers and made blue links feel trustworthy at scale. The 1998 Brin and Page paper spelled out the mechanics and the mindset. Relevance flowed through the link graph. Authority lived in math, not in marketing claims.¹
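For intuition, here is a minimal sketch of the links-as-votes idea in Python. The toy graph, damping factor, and iteration count are illustrative assumptions, not Google’s production values; real PageRank runs over billions of nodes with many refinements.

```python
# Minimal PageRank sketch: rank flows along links and is redistributed each
# iteration until it stabilizes. All values here are toy assumptions.

def pagerank(graph, damping=0.85, iterations=50):
    """graph maps each page to the list of pages it links to."""
    pages = list(graph)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in graph.items():
            if not outlinks:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
print(pagerank(toy_web))  # "c" accumulates the most rank because it collects the most votes
```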

When did search shift from strings to things?

Google’s Knowledge Graph formalized a quiet revolution. In 2012, Google started modeling entities and their relationships, which let results reflect the world rather than just the words on a page. The “things, not strings” launch mattered because it changed the unit of retrieval. Entities could be displayed, compared, and reasoned over in panels, not only retrieved as documents. That pushed search beyond lookups toward lightweight answers and context on the results page.² Wired covered the rollout and highlighted early scale claims about hundreds of millions of entities and billions of facts. That was the first mainstream hint that the index was becoming a knowledge base.³

Why did Google start answering questions on the page?

Featured snippets arrived in 2014 and normalized the idea that the answer should sit above the links. The source still mattered, but the SERP became a competitor to the source’s page. Google later “reintroduced” snippets in 2018 to explain the intent and behavior behind them, which confirmed the strategic direction. The platform wanted to resolve queries in situ, limit pogo-sticking, and speed up time to comprehension. This tradeoff gave users speed and took leverage from publishers. It also trained users to accept direct answers as a default.⁴

How did machine learning change the ranking brain?

RankBrain in 2015 signaled that learning systems would mediate queries and results. Bloomberg’s report made two points clear. RankBrain helped Google interpret unfamiliar or long-tail searches, and it rapidly became one of the top ranking factors. This was a shift from static rules to adaptive understanding. Relevance would be learned across patterns, not only derived from keywords and links. The old knobs of on-page SEO still mattered, but the interpretation layer got smarter, faster, and harder to game.⁵

What did BERT unlock in 2019?

BERT improved sentence-level understanding by modeling words in context, not in isolation. Google said it was one of the biggest leaps in five years, and the examples showed better handling of prepositions, negations, and subtle intent. That mattered for informational queries where meaning hides in small words. BERT did not replace the index. It re-weighted how queries and passages aligned, which lifted the ceiling on what the results page could correctly interpret.⁶

What changed with passage ranking?

In 2020, Google announced it could rank specific passages inside long pages. The company framed it as a ranking breakthrough that surfaces needle-in-haystack answers. That meant long content no longer needed perfect structure to be discovered at the paragraph level. It also meant the SERP could pull value from deeper inside a page without rewarding the whole page equally. This rewarded clarity inside documents and made granular relevance a first-class citizen.⁷

What is MUM and why did it matter?

In 2021, Google introduced the Multitask Unified Model. MUM used a text-to-text framework and was described as 1,000 times more powerful than BERT. The intent was to shortcut multi-step research and bridge content across languages and modalities. The takeaway was not the multiplier. The takeaway was that search wanted to behave like an expert that synthesizes, compares, and reasons across sources, not an index that lists them.⁸

Who moved first on LLM-powered answers?

Microsoft fired the starting gun in early 2023 by shipping a GPT-4-powered Bing with chat, sources, and an index behind it. The pitch was conversational answers with citations. The implication was strategic. A general-purpose LLM, grounded by live retrieval, could answer and attribute in one motion. That framed the next competitive stage. Answers would be generated, not just retrieved, and provenance would be part of the product story.⁹

What is SGE and what did AI Overviews change?

Google announced its Search Generative Experience in May 2023. The company described generative AI as a way to reduce user effort and accelerate understanding. In May 2024, AI Overviews rolled out broadly in the United States with a stated plan to reach more than a billion people by year end. By October 2024, Google said AI Overviews had reached more than 100 countries and were used by over a billion people monthly. The message was clear. Generative summaries are no longer a lab demo. They are the interface.¹⁰ ¹¹ ¹²

How reliable are generative answers at scale?

Reality has been mixed. AI Overviews have delivered speed and synthesis, yet high-profile glitches and basic-fact misses earned scrutiny. Wired documented a case where the overview answered the current-year query incorrectly, which crystallized the risk. Generative systems are powerful, but their confidence often exceeds their calibration. Enterprise leaders should treat the SERP as a living system where model behavior, guardrails, and retrieval quality all interact. That is not a reason to ignore it. That is a reason to monitor it with the same rigor used for security incidents and uptime.¹³

What is the modern “search stack” now?

The search stack is no longer a crawler, an index, and a ranker. The modern stack is a pipeline that moves from web capture to entity resolution to retrieval-time reasoning. At minimum, it includes these layers.

Crawl and render

Engines fetch HTML, APIs, feeds, and media, execute scripts, and build normalized representations of what a page says and what it is allowed to expose. This step is still governed by robots rules and sitemaps. That has not changed. That will not change quickly.⁷
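As a small illustration of that first point, a crawler can check robots rules with Python’s standard library before fetching a URL. The user agent and URLs below are placeholders.

```python
# Hedged sketch: checking whether a crawler may fetch a URL under robots rules.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")  # placeholder site
rp.read()  # fetches and parses the robots.txt file

# "ExampleBot" is a made-up user agent for illustration.
if rp.can_fetch("ExampleBot", "https://example.com/products/widget"):
    print("allowed to crawl")
else:
    print("disallowed by robots.txt")
```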

Entity and knowledge graph

Systems resolve people, places, organizations, products, and claims into nodes and edges. This enables panels, comparisons, and constraint-based answers. The Knowledge Graph institutionalized this layer a decade ago, which is why panels feel instant and consistent.² ³
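A toy example of what this layer stores: entities as nodes, relationships as edges, queried by constraint rather than by keyword. The entity names below are made up.

```python
# Tiny knowledge graph as (subject, predicate, object) triples, answering a
# constraint-based question: which products does AcmeCorp sell?
triples = [
    ("AcmeCorp", "is_a", "Organization"),
    ("AcmeCorp", "sells", "WidgetPro"),
    ("AcmeCorp", "sells", "WidgetLite"),
    ("WidgetPro", "is_a", "Product"),
    ("WidgetLite", "is_a", "Product"),
]

def objects(subject, predicate):
    """Return every object linked to a subject by the given predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(objects("AcmeCorp", "sells"))  # ['WidgetPro', 'WidgetLite']
```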

Learning to rank

Models like RankBrain and BERT interpret intent and map it to documents and passages. This is where the old “keywords” idea mostly died. The system now optimizes for meaning rather than string overlap.⁵ ⁶
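The contrast can be made concrete with a toy comparison: string overlap scores two related texts near zero, while a meaning-based score rates them as close. The “embedding” vectors below are hand-picked for illustration, not the output of any real model.

```python
# Toy contrast between string-overlap scoring and meaning-based scoring.
import math

def jaccard(a, b):
    """Fraction of shared words: the old string-overlap view of relevance."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def cosine(u, v):
    """Similarity between two vectors: the meaning-based view of relevance."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

query = "how to change a flat tyre"
passage = "steps for replacing a punctured car wheel"

print(round(jaccard(query, passage), 2))                      # ~0.08: barely any shared words
print(round(cosine([0.9, 0.2, 0.4], [0.85, 0.25, 0.38]), 2))  # ~1.0: hand-picked vectors for similar meaning
```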

Generative synthesis

Newer layers build answers by retrieving, scoring, and composing snippets in natural language. This is where SGE and AI Overviews operate. They orchestrate retrieval and generation to produce a readable response with links. The transition from ranked links to synthesized paragraphs is the change that matters in 2025.¹⁰ ¹¹ ¹²

What is RAG and why is it the default pattern?

Retrieval-augmented generation combines a parametric model with an external index so answers can cite, refresh, and constrain themselves. The original 2020 RAG paper showed accuracy gains on knowledge-intensive tasks by letting the generator look things up at generation time. The strategic point for executives is simple. Answer engines that can retrieve on demand age better than answers that live only in weights. That is why both consumer search and enterprise assistants converge on this pattern.¹⁴
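A minimal sketch of the pattern, under obvious simplifications: the retriever here is a keyword scorer over a three-document corpus, and the generation step is a placeholder that only builds the prompt an LLM would receive. Corpus contents and function names are illustrative.

```python
# Retrieval-augmented generation in miniature: retrieve passages, then compose a
# prompt that asks the generator to answer from those passages and cite them.

CORPUS = {
    "doc1": "AI Overviews rolled out broadly in the US in May 2024.",
    "doc2": "PageRank treats links as votes to estimate page importance.",
    "doc3": "Featured snippets place an answer above the organic links.",
}

def retrieve(query, k=2):
    """Score documents by shared terms and return the top-k (id, text) pairs."""
    q_terms = set(query.lower().split())
    scored = sorted(
        CORPUS.items(),
        key=lambda kv: len(q_terms & set(kv[1].lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, passages):
    """Placeholder for the generation step: a real system sends this to an LLM."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in passages)
    return (
        "Answer the question using only the sources below and cite them.\n"
        f"{context}\n\nQ: {query}\nA:"
    )

question = "When did AI Overviews roll out in the US?"
print(build_prompt(question, retrieve(question)))
```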

Did zero-click behavior accelerate the shift?

User behavior validated the platform’s strategy. Multiple studies using clickstream data suggest that a growing share of searches end without a click, which means the answer lived on the SERP. SparkToro’s 2024 analysis, built on Similarweb data, estimated that only about 360 out of every 1,000 U.S. Google searches produced a click to the open web. Search Engine Land reported further year-over-year increases in zero-click rates in 2025, with organic click share falling across the U.S. and EU. The blue link is not dead. The blue link is simply not the default destination.¹⁵ ¹⁶

How should brands think about “answerability” now?

Answerability is the ability of a brand’s identity, facts, and content to be discovered, grounded, and cited by answer engines. It depends on machine-readable identity, clean entity resolution, and content shaped like answers rather than brochure text. It depends on citations that can be lifted into overviews without friction. It depends on avoiding ambiguity about who you are, what you offer, and which page is canonical when an engine needs a single URL to represent a claim. The content that wins is specific, verifiable, and instrumented for machines.

Where do definitions and claims live in this new world?

Definitions live in structured data, brand fact files, and verified profiles that align with public knowledge bases. Claims live in passages that cite sources and can be excerpted cleanly. The practical move is to treat each page as two artifacts. One artifact is for humans. The other artifact is for machines and lives in schema, feeds, and machine endpoints. If your facts change, publish change feeds. If your names collide, publish disambiguation. If your leadership, locations, or prices update, anchor the change with effective dating. These are basic data hygiene moves that help models ground correctly.²
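One way to publish the machine-facing artifact is schema.org Organization markup embedded as JSON-LD. Everything in this sketch, including the entity ID and URLs, is a placeholder.

```python
# Illustrative brand fact file as schema.org Organization markup (JSON-LD).
import json

brand_facts = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Brand Inc.",                # placeholder name
    "url": "https://www.example.com",
    "sameAs": [                                  # links that disambiguate the entity
        "https://www.wikidata.org/wiki/Q00000000",     # placeholder entity ID
        "https://www.linkedin.com/company/example-brand",
    ],
    "foundingDate": "2012-03-01",
    "address": {"@type": "PostalAddress", "addressCountry": "US"},
}

# Embed the output in a <script type="application/ld+json"> tag on the canonical page.
print(json.dumps(brand_facts, indent=2))
```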

What content patterns map to LLM answers?

LLM answers tend to pull short, complete explanations from high-authority pages. They reward paragraphs that resolve a question in under 120 words, with one claim and one clear citation. They reward entity-first writing that states the subject, the action, and the object clearly in the first sentence. They reward definitions that are consistent across pages. They punish hedging, vagueness, and brand-speak that never says anything falsifiable. If you want the citation, write like you expect to be excerpted.
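The guidance above can be turned into a rough lint check for draft paragraphs. The 120-word threshold mirrors the rule of thumb in this section, and the other heuristics are assumptions, not criteria published by any engine.

```python
# Rough "answer-ready paragraph" lint based on the heuristics described above.
import re

def lint_paragraph(text, max_words=120):
    words = text.split()
    issues = []
    if len(words) > max_words:
        issues.append(f"too long: {len(words)} words")
    if not re.search(r"https?://|\[\d+\]", text):          # expects a link or [n] citation
        issues.append("no visible citation or source link")
    hedges = {"might", "perhaps", "arguably", "somewhat"}   # sample hedging terms
    if hedges & {w.lower().strip(".,") for w in words}:
        issues.append("contains hedging language")
    return issues or ["looks answer-ready"]

print(lint_paragraph("AI Overviews reached more than 100 countries by October 2024 [12]."))
```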

What are the risks and how should leaders mitigate them?

Three risks dominate. First, mis-grounding, where an engine maps your brand to the wrong entity and cites a neighbor or competitor. Second, outdated facts, where your own stale pages give the model permission to be wrong. Third, opaque changes in the answer layer that alter your traffic without notice. Leaders should manage these like supply-chain risks. Keep an identity registry, publish canonical IDs, maintain an external mappings index, and monitor entity panels in major engines. Keep a release cadence for schema and endpoint updates. Instrument your content to detect when citations appear or disappear. Publish a visible feedback channel for corrections.
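Citation monitoring, in particular, reduces to comparing snapshots over time. The sketch below assumes you already collect which URLs an answer engine cites for each tracked question; the snapshot data shown is made up.

```python
# Detect when citations for tracked questions appear or disappear between snapshots.

def citation_drift(previous, current):
    """Return queries whose cited URLs were gained or lost since the last snapshot."""
    changes = {}
    for query in previous.keys() | current.keys():
        gained = current.get(query, set()) - previous.get(query, set())
        lost = previous.get(query, set()) - current.get(query, set())
        if gained or lost:
            changes[query] = {"gained": gained, "lost": lost}
    return changes

yesterday = {"what is example brand": {"https://www.example.com/about"}}
today = {"what is example brand": {"https://en.wikipedia.org/wiki/Example"}}
print(citation_drift(yesterday, today))
```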

How do you measure progress when clicks are declining?

Measure answer share rather than only click share. Track mentions and citations in AI Overviews and Bing chat answers for your top queries. Monitor the presence of your brand and canonical URLs in featured snippets, knowledge panels, and generative summaries. Use benchmark panels of fixed questions to detect answer drift. Compare exposure by query class, not only by rank. Give your board two numbers that summarize the new world. One number is visibility inside answers. The other number is the rate of factual errors about your brand that appear in those answers. Improvement on both is what matters.
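The two board-level numbers can be computed from a fixed benchmark panel. The panel below is illustrative; in practice each row would come from a scheduled check of the answer surfaces listed above.

```python
# Compute answer visibility and factual-error rate from a benchmark panel.
panel = [
    {"query": "what does example brand do", "brand_cited": True,  "fact_error": False},
    {"query": "example brand pricing",      "brand_cited": False, "fact_error": False},
    {"query": "example brand founders",     "brand_cited": True,  "fact_error": True},
]

answer_share = sum(q["brand_cited"] for q in panel) / len(panel)
error_rate = sum(q["fact_error"] for q in panel) / len(panel)

print(f"visibility inside answers: {answer_share:.0%}")          # 67%
print(f"factual error rate about the brand: {error_rate:.0%}")   # 33%
```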

What do the next two years likely bring?

Expect more AI-first interfaces, deeper integration between personal data and web results, and heavier use of retrieval constraints to reduce errors. Expect continued experiments with AI-only modes that de-emphasize classic SERP layouts for some users. Expect stronger disclosure and provenance cues as regulators push for transparency. Expect more countries to gain AI Overviews and more modalities to enter the answer block. Expect less room for mediocre content that never states a clear claim. Expect more reward for crisp paragraphs that resolve a question and stand on their own.¹² ¹⁷

How should an executive respond this quarter?

Act like a publisher with a modern data layer. Assign ownership over machine-readable identity. Publish a brand fact file and keep it authoritative. Map core pages to entity IDs used by public graphs. Instrument your top 200 questions with answer-ready paragraphs and durable citations. Stand up monitoring for AI answer panels on those questions. Treat schema and endpoint updates like product releases. Give your team a target that fits the times. You are not trying to rank only. You are trying to become the default citation when an LLM explains your domain.

Sources

  1. “The Anatomy of a Large-Scale Hypertextual Web Search Engine.” Sergey Brin and Lawrence Page. 1998. Computer Networks and ISDN Systems.
  2. “Introducing the Knowledge Graph: Things, not strings.” Google Search team. 2012. Official Google Blog.
  3. “Google Revamps Search With Massive ‘Real-World Map of Things’.” Cade Metz. 2012. Wired.
  4. “A reintroduction to Google’s featured snippets.” Google Search team. 2018. The Keyword.
  5. “Google Turning Its Lucrative Web Search Over to AI Machines.” Jack Clark. 2015. Bloomberg.
  6. “Understanding searches better than ever before.” Pandu Nayak. 2019. The Keyword.
  7. “How AI is powering a more helpful Google.” Google Search team. 2020. The Keyword.
  8. “MUM: A new AI milestone for understanding information.” Google Search team. 2021. The Keyword.
  9. “Reinventing search with a new AI-powered Microsoft Bing and Edge.” Microsoft. 2023. Official Microsoft Blog.
  10. “Supercharging Search with generative AI.” Google. 2023. The Keyword.
  11. “Generative AI in Search: Let Google do the searching for you.” Google. 2024. The Keyword.
  12. “AI Overviews in Search are coming to more places around the world.” Google. 2024. The Keyword.
  13. “Google AI Overviews Says It’s Still 2024.” Wired Staff. 2025. Wired.
  14. “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.” Patrick Lewis et al. 2020. NeurIPS.
  15. “2024 Zero-Click Search Study.” Rand Fishkin. 2024. SparkToro.
  16. “Zero-click searches rise, organic clicks dip: Report.” 2025. Search Engine Land.
  17. “Google tests an AI-only version of its search engine.” 2025. Reuters.

FAQs

What did the “blue links” era optimize for in Google Search?
It optimized for document retrieval ranked by text relevance and link authority, with PageRank treating links as votes to infer importance across the web graph.

How did Google’s Knowledge Graph change search from “strings” to “things”?
The Knowledge Graph modeled entities and relationships so Google could serve entity-aware panels and context, shifting the retrieval unit from pages of text to structured real-world concepts.

What role did RankBrain, BERT, and passage ranking play in query understanding?
RankBrain learned intent patterns for unfamiliar queries; BERT interpreted words in context to improve sentence-level meaning; passage ranking surfaced specific paragraphs inside long pages, making granular relevance a first-class signal.

What is MUM and why does it matter to modern search?
Google’s Multitask Unified Model (MUM) aims to compress multi-step research by reasoning across languages and modalities, moving search behavior toward expert-style synthesis rather than simple listing.

How did Microsoft’s GPT-4-powered Bing Chat and Google’s SGE/AI Overviews shift the interface?
Bing combined conversational answers with citations, while Google’s SGE and AI Overviews introduced generated summaries on the SERP, making synthesized, source-backed answers the default experience for many queries.

What is Retrieval-Augmented Generation (RAG) and why is it the default answer pattern?
RAG pairs a generative model with live retrieval so answers can ground in current sources and cite them, reducing staleness and improving accuracy on knowledge-intensive tasks.

Which metrics should brands track when clicks decline in zero-click SERPs?
Track “answer share” and branded citation presence inside AI Overviews, Bing Chat, featured snippets, and knowledge panels, plus the rate of factual errors about your brand to monitor grounding quality and visibility.
