For years, SEOs operated on a mixture of correlation studies, patent interpretations, and whatever Google's public liaisons chose to disclose. That changed under oath. When Pandu Nayak, Google's Vice President of Search, walked the court through the architecture of Google's ranking systems during the DOJ antitrust trial, it wasn't speculation or a leaked document stripped of context-it was a Google executive explaining how the system actually works under penalty of perjury.
Then in January 2026, a second wave hit. Elizabeth Reid, Google's current Vice President and Head of Search, filed an affidavit warning that court-mandated sharing of Google's search index, rankings, and user data would expose proprietary systems and enable reverse engineering. Her filing inadvertently confirmed details about Glue, RankEmbed, crawl scheduling, spam scoring, and index tiering that Google had spent decades keeping secret. What follows isn't another rehash of "200 ranking factors." It's a practitioner's synthesis of what the trial transcripts, the Reid affidavit, and the 2024 Content Warehouse API leak collectively reveal about how Google actually decides which pages deserve to rank-and what you should do about it.
The Two Master Signals: Quality (Q*) and Popularity (P*)
Strip away every subsystem, every twiddler, every machine learning refinement layer, and Google's ranking architecture reduces to two master-level inputs. The trial revealed that Google's system boils down to two "fundamental top-level ranking signals" that determine a webpage's score. Quality (Q*) assesses trustworthiness and authority, heavily influenced by PageRank and by distance from known good sources. Popularity (P*) measures how widely visited and well-linked a page is, powered directly by Chrome visit data and user interaction signals from NavBoost.
This matters because it reframes every tactical SEO decision. You're no longer optimizing for a mysterious black box with hundreds of unknowable variables. You're building evidence that satisfies two questions: Is this trustworthy? and Do real people find this useful?
How Quality (Q*) Gets Calculated
Google's ranking stack separates Topicality (T*) from Quality (Q*). In a trial exhibit, Google engineer Hyung-Jin Kim described Q* as largely static and "related to the site rather than the query," with PageRank feeding into that quality signal. This is a site-wide assessment, not a page-by-page evaluation that shifts with every query.
The unredacted remedial opinion confirmed that Quality incorporates signals of authoritativeness, content-derived metrics, and rater scoring, and is directly used in determining a page's ranking. Think of it as a composite trust score. PageRank feeds into it, but so does the Body (B) signal from the ABC framework-the text on your actual page. The Anchors (A) and Clicks (C) components serve as direct inputs to the Popularity (P*) signal, while the Body (B) component is a content-derived metric that feeds into the Quality signal.
The practical implication: your page text isn't just serving users. It's a primary input to the quality half of your ranking score.
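To make the routing concrete, here's a minimal Python sketch of how the ABC components could feed the two master signals. Only the routing is sourced from the trial record (Anchors and Clicks into Popularity; Body and PageRank into Quality); the weights, function names, and the way T*, Q*, and P* combine are illustrative assumptions, not Google's actual math.

```python
# Sketch of the ABC-to-Q*/P* routing described in the trial record.
# Weights and the combination formula are assumptions for illustration;
# only the routing of inputs is sourced: Anchors (A) and Clicks (C) feed
# Popularity, while Body (B) and PageRank feed Quality.

def quality_score(body_relevance: float, pagerank: float) -> float:
    """Q*: a largely static, site-level trust score (hypothetical weights)."""
    return 0.5 * body_relevance + 0.5 * pagerank

def popularity_score(anchors: float, chrome_visits: float,
                     navboost_clicks: float) -> float:
    """P*: how well-linked and widely visited a page is (hypothetical weights)."""
    return 0.3 * anchors + 0.4 * chrome_visits + 0.3 * navboost_clicks

def page_score(topicality: float, q_star: float, p_star: float) -> float:
    """Final score combining Topicality (T*) with the two master signals.
    The multiplicative form here is purely an assumption."""
    return topicality * (q_star + p_star)

q = quality_score(body_relevance=0.8, pagerank=0.6)
p = popularity_score(anchors=0.5, chrome_visits=0.7, navboost_clicks=0.6)
print(f"Q*={q:.2f}  P*={p:.2f}  final={page_score(0.9, q, p):.2f}")
```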
How Popularity (P*) Gets Calculated
The DOJ remedial opinion revealed that Google's Popularity signal (P*) draws on Chrome browsing data and the number of anchors to measure how "well-linked" and widely visited a page is. This was the single most consequential confirmation of the trial. Google had for years denied or sidestepped questions about whether Chrome data influenced rankings.
The Popularity signal combines three data streams: Chrome browsing behavior, anchor link structures, and user interaction data from NavBoost. Pages that are both well-linked and frequently visited get a compounding advantage.
NavBoost: The Click Memory System Google Didn't Want You to Know About
Google has long claimed that user signals such as clicks are not direct ranking factors. Court documents from the antitrust case reveal a completely different reality.
Nayak's testimony confirmed unequivocally that Google does use clicks for ranking through a system called NavBoost, one of Google's most important ranking signals. It works by analyzing a rolling 13-month window of aggregated user click data. As the court documents stated, "Learning from this user feedback is perhaps the central way that web ranking has improved for 15 years."
NavBoost isn't a machine learning system. As Dr. Eric Lehman testified, "Navboost is not a machine learning system. It's just a big table. It says for this search query, this document got two clicks. For this query, this document got three clicks, and so on. And it's aggregated, and there's a little bit of extra data." It's a giant lookup table pairing queries with click patterns. Simple in concept. Devastating in impact.
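Because Lehman describes NavBoost as "just a big table," it maps naturally onto a keyed counter rather than a model. Here's a minimal sketch under that reading; the field names and structure are hypothetical, and only the query-to-document click aggregation comes from the testimony.

```python
from collections import defaultdict

# NavBoost per Lehman's description: not machine learning, just a huge
# aggregated table keyed by (query, document). Field names are hypothetical.
click_table: dict[tuple[str, str], dict[str, int]] = defaultdict(
    lambda: {"clicks": 0, "impressions": 0}
)

def record_interaction(query: str, doc_id: str, clicked: bool) -> None:
    """Aggregate one user interaction into the table."""
    row = click_table[(query, doc_id)]
    row["impressions"] += 1
    if clicked:
        row["clicks"] += 1

def lookup(query: str, doc_id: str) -> dict[str, int]:
    """Ranking-time read: no inference, just a table lookup."""
    return click_table[(query, doc_id)]

record_interaction("best running shoes", "doc123", clicked=True)
record_interaction("best running shoes", "doc123", clicked=False)
print(lookup("best running shoes", "doc123"))  # {'clicks': 1, 'impressions': 2}
```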
The Click Types That Matter
The 2024 Content Warehouse API leak revealed the specific metrics NavBoost tracks. The Craps module handles storage and processing of click signals, including badClicks-where the user quickly returns to the search results (pogo-sticking) signaling dissatisfaction-and lastLongestClicks, which identify the final result a user dwells on, suggesting their search journey ended successfully.
NavBoost answers a fundamentally different question than traditional ranking: "How satisfying do real users find this document when they are searching for this specific query?" It uses aggregated click behavior as a direct measure of user satisfaction and likely operates on a COEC model (Clicks Over Expected Clicks), where a result in the fourth position getting significantly more clicks than average would receive a ranking boost.
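A COEC-style boost is straightforward to express: compare observed clicks against the clicks you'd expect given the positions where the result appeared. The position priors and the ratio below are illustrative assumptions; the trial record supports only the general clicks-over-expected-clicks concept.

```python
# Hypothetical position priors: expected click-through rate by rank.
EXPECTED_CTR = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05}

def coec(clicks: int, impressions_by_position: dict[int, int]) -> float:
    """Clicks Over Expected Clicks: a ratio above 1.0 means the result
    outperforms its position and may deserve a boost; below 1.0, a demotion."""
    expected = sum(EXPECTED_CTR.get(pos, 0.02) * n
                   for pos, n in impressions_by_position.items())
    return clicks / max(expected, 1e-9)

# A result shown 1,000 times at position 4 that earned 120 clicks:
print(f"COEC = {coec(120, {4: 1000}):.2f}")  # ~1.71, far above expectation
```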
Context Slicing: Device, Location, Time
NavBoost doesn't treat all clicks equally. The system segments its click data by geographic location and device type, allowing NavBoost to prioritize results that have performed well for users in a similar situation-for example, boosting a local business's website for mobile users in a specific city.
NavBoost can only score a page after users have clicked on it, and it maintains separate datasets (slices) for mobile and desktop searches. If your mobile page provides a poor experience while your desktop version performs well, the two surfaces are evaluated independently.
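In table terms, slicing just means widening the lookup key. A quick sketch, assuming device and location are key dimensions (the dimensions named in the reporting); everything else here is illustrative.

```python
from collections import Counter

# Slice-aware click counts: the same query/doc pair is tracked separately
# per device type and location. The key layout is an assumption.
sliced_clicks: Counter = Counter()

def record(query: str, doc_id: str, device: str, location: str) -> None:
    sliced_clicks[(query, doc_id, device, location)] += 1

record("coffee shop", "localcafe.example", "mobile", "austin")
record("coffee shop", "localcafe.example", "mobile", "austin")
record("coffee shop", "localcafe.example", "desktop", "austin")

# Mobile users in Austin accumulate a different click history than desktop:
print(sliced_clicks[("coffee shop", "localcafe.example", "mobile", "austin")])   # 2
print(sliced_clicks[("coffee shop", "localcafe.example", "desktop", "austin")])  # 1
```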
Glue and Instant Glue: The Real-Time SERP Assembler
While NavBoost handles the "blue links," a companion system called Glue processes every other SERP interaction. Glue is effectively the NavBoost ecosystem extended to all the other rich features on a given SERP. It aggregates user interactions such as "clicks, hovers, scrolls, and swipes" into a common metric that compares search results and features.
Reid's affidavit offered a bombshell detail about the data volume. Glue captures 13 months of U.S. search logs, and Google argued this would amount to a massive, ongoing disclosure of Google's ranking output at scale. The system logs include queries, locations, clicks, hovers, and result orders-essentially a complete behavioral record of how Americans interact with Google.
A component called Instant Glue operates on a 24-hour data window with a latency of approximately 10 minutes, allowing Google to adapt rankings to real-time events. Breaking news queries, trending topics, and sudden shifts in search behavior feed into this near-real-time pipeline. For time-sensitive content, this means Google can recalibrate within minutes of a major event.
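Those two numbers (a 24-hour window, roughly 10-minute latency) describe a classic sliding-window aggregation. Here's a sketch of that pattern; the data structure and lazy eviction are my assumptions, not a description of Google's pipeline.

```python
from collections import deque

WINDOW_SECONDS = 24 * 60 * 60  # the 24-hour window cited for Instant Glue

class SlidingWindowCounter:
    """Counts interactions over a trailing 24-hour window. Old events are
    evicted on read, so a reader polling every ~10 minutes sees data that
    is never more than a few minutes stale."""

    def __init__(self) -> None:
        self.events: deque = deque()  # event timestamps, oldest first

    def record(self, ts: float) -> None:
        self.events.append(ts)

    def count(self, now: float) -> int:
        while self.events and self.events[0] < now - WINDOW_SECONDS:
            self.events.popleft()  # evict events older than 24 hours
        return len(self.events)

counter = SlidingWindowCounter()
counter.record(ts=0)        # an interaction at t=0
counter.record(ts=90_000)   # another interaction ~25 hours later
print(counter.count(now=90_060))  # 1: the t=0 event has aged out
```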
RankEmbed BERT: Where Human Judgment Trains the Machine
NavBoost is about remembering what users click. RankEmbed BERT is about understanding what users mean. The DOJ remedial opinion makes clear that human quality raters are not just external evaluators-their judgments directly shape the core of Google's ranking systems. The RankEmbed and RankEmbedBERT models are trained on two primary sources of data: search logs and human rater scores. This elevates rater input from "guidance" to direct training data.
RankEmbed trains on a rolling window of about 70 days of search data and also learns from human quality raters who score pages for expertise, trustworthiness, and clarity. Their evaluations don't directly boost or demote pages but teach the model what "good" looks like.
This is where E-E-A-T stops being an abstract marketing concept and becomes a measurable input. Rater-trained RankEmbedBERT models significantly improved Google's ability to process complex, long-tail queries, where language understanding is essential. Human raters evaluate pages based on the criteria in Google's Quality Rater Guidelines. Those evaluations train the model. The model then scales those judgments across billions of queries. If your page doesn't meet the standards a quality rater would apply, RankEmbed BERT has been trained to recognize that pattern.
RankEmbed is notably efficient, trained on 1/100th of the data used for earlier ranking models, yet it provides higher quality search results. Google achieved more with less because human rater data provides a cleaner training signal than raw click logs alone.
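You can picture the two training sources as one merged set of labeled (query, document) examples: noisy labels distilled from click logs and clean labels from rater scores. The label scales and conversion functions below are invented for illustration; only the two sources themselves come from the opinion.

```python
from dataclasses import dataclass

@dataclass
class TrainingExample:
    query: str
    doc_id: str
    label: float   # 0.0 (bad) .. 1.0 (great)
    source: str    # "click_logs" or "rater"

def from_click_logs(query: str, doc_id: str, coec_ratio: float) -> TrainingExample:
    """Weak label derived from aggregated click behavior (illustrative scaling)."""
    return TrainingExample(query, doc_id, min(coec_ratio / 2.0, 1.0), "click_logs")

def from_rater(query: str, doc_id: str, rater_score: int) -> TrainingExample:
    """Clean label from a human rater, using a hypothetical 1-5 scale."""
    return TrainingExample(query, doc_id, (rater_score - 1) / 4.0, "rater")

# A rolling ~70-day slice of logs plus rater judgments form the training set:
train_set = [
    from_click_logs("symptoms of flu", "health.example/flu", coec_ratio=1.6),
    from_rater("symptoms of flu", "health.example/flu", rater_score=5),
]
for ex in train_set:
    print(ex)
```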
FastSearch: How AI Overviews Get Their Answers
A separate but related system connects this to AI Overviews. FastSearch is built on RankEmbed signals and generates abbreviated, ranked web results that a model can use to produce a grounded response. When Google's AI generates an overview, it doesn't query the full search index. It uses FastSearch, which relies heavily on RankEmbed rather than the traditional link-heavy ranking pipeline. There are likely spam and quality signals that don't get computed for FastSearch either, which would explain how early versions showed spammy and even penalized sites in AI Overviews.
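Mechanically, grounding an overview this way looks like embedding similarity over precomputed vectors rather than a full index run. A minimal sketch of that retrieval pattern; the vectors, similarity choice, and cutoff are stand-ins, not RankEmbed's actual representation.

```python
import math

# Toy document vectors standing in for RankEmbed-style embeddings.
DOC_VECTORS = {
    "doc_a": [0.9, 0.1, 0.2],
    "doc_b": [0.2, 0.8, 0.3],
    "doc_c": [0.7, 0.3, 0.1],
}

def cosine(u, v) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def fast_search(query_vec, k: int = 2) -> list:
    """Return an abbreviated, ranked result list for the model to ground on.
    Note what's absent: no link graph, no spam or quality scores, which
    would be consistent with early AI Overviews surfacing penalized sites."""
    ranked = sorted(DOC_VECTORS,
                    key=lambda d: cosine(query_vec, DOC_VECTORS[d]),
                    reverse=True)
    return ranked[:k]

print(fast_search([0.8, 0.2, 0.2]))  # ['doc_a', 'doc_c']
```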
For practitioners, this means optimizing for AI Overviews requires semantic relevance and depth over link volume-a fundamentally different game than traditional organic SEO.
The Spam Scoring System Hiding Behind Every DocID
Every page in Google's index has a unique identifier-a DocID-packed with associated signals, from click data to spam scores. Among those signals is a spam score that most site owners have never seen but that shapes their visibility every day.
Reid stressed in her affidavit that "fighting spam depends on obscurity, as external knowledge of spam-fighting mechanisms or signals eliminates the value of those mechanisms and signals." Google argued that disclosing spam scores to competitors would cripple its anti-spam infrastructure. That argument itself confirms how central spam scoring is to the ranking pipeline.
Every website in Google's index receives a spam score that influences both crawling frequency and ranking decisions. This score is likely based on a combination of content quality, user engagement metrics, technical implementation, and compliance with quality guidelines. The existence of a formal spam scoring system explains many sudden ranking drops that websites experience.
This system creates a competitive dynamic where maintaining high content quality becomes essential not just for rankings but for basic crawling and indexing. Sites with poor spam scores may find their new content crawled less frequently, creating a negative feedback loop that compounds over time. If your crawl rate drops unexpectedly, a rising spam score may be the culprit-not a server issue.
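A toy crawl scheduler makes the feedback loop visible. The formula and numbers below are invented; the sourced claims are only that a per-site spam score exists and that it influences crawl frequency and ranking.

```python
from dataclasses import dataclass

@dataclass
class Site:
    domain: str
    engagement: float  # 0..1, aggregated user-interaction quality
    spam_score: float  # 0..1, higher is worse

def crawl_priority(site: Site) -> float:
    """Hypothetical scheduler: engagement earns crawl budget, spam erodes it.
    A rising spam score starves a site of recrawls, which delays indexing
    of improvements, which depresses engagement further: the feedback loop."""
    return site.engagement * (1.0 - site.spam_score)

sites = [
    Site("clean.example", engagement=0.7, spam_score=0.1),
    Site("spammy.example", engagement=0.7, spam_score=0.8),
]
for s in sorted(sites, key=crawl_priority, reverse=True):
    print(f"{s.domain}: crawl priority = {crawl_priority(s):.2f}")
```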
Hand-Crafted Signals Still Run the Show (For Now)
One of the most underreported revelations from the trial is this: Google's algorithm is not an opaque neural network. The trial revealed that Google's search ranking systems are fundamentally grounded in signals that are "hand-crafted" by its engineers-most ranking factors are engineered manually rather than learned by black-box ML, a deliberate choice made for control and stability.
Engineer Hyung-Jin Kim explained the rationale: "The reason why the vast majority of signals are hand-crafted is that if anything breaks Google knows what to fix. Google wants their signals to be fully transparent so they can trouble-shoot them and improve upon them." This directly contradicts the narrative many SEOs have internalized-that Google's system is an unknowable AI black box where optimization is futile.
Machine learning models like RankBrain and BERT act as a final refinement layer to better understand semantics, rather than replacing foundational signals. The hand-crafted foundation handles most of the heavy lifting. ML models adjust the final 20-30 documents at the top. And even then, Nayak said: "I think it's risky for Google-or for anyone else, for that matter-to turn over everything to a system like these deep learning systems as an end-to-end top-level function. I think it makes it very hard to control."
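This division of labor is a standard two-stage ranking pattern: a transparent, hand-crafted scorer orders the candidates, and a learned model re-orders only the short head of the list. A sketch of the pattern; both scoring functions are placeholders, and the rerank depth of 30 simply echoes the 20-30 figure above.

```python
def handcrafted_score(doc: dict) -> float:
    """Stage 1: transparent, debuggable signals (placeholder weights)."""
    return 2.0 * doc["topicality"] + 1.0 * doc["pagerank"] + 1.5 * doc["navboost"]

def ml_rerank_score(doc: dict) -> float:
    """Stage 2: stand-in for a learned refinement layer (RankBrain/BERT-style),
    applied only to the head of the candidate list."""
    return doc["semantic_match"]

def rank(docs: list, rerank_depth: int = 30) -> list:
    base = sorted(docs, key=handcrafted_score, reverse=True)
    head = sorted(base[:rerank_depth], key=ml_rerank_score, reverse=True)
    return head + base[rerank_depth:]  # the tail keeps the hand-crafted order

docs = [{"id": i, "topicality": 0.5, "pagerank": 0.5, "navboost": 0.5,
         "semantic_match": i / 100} for i in range(100)]
print([d["id"] for d in rank(docs)[:5]])  # only the head gets reordered
```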
This means the foundational principles of SEO-topical relevance, link authority, content quality, technical accessibility-remain the bedrock of ranking. They're not legacy factors being replaced by AI. They're engineered signals that Google deliberately maintains.
What the Antitrust Remedies Mean for SEO's Future
In September 2025, Judge Mehta ruled that Google would not be required to divest Chrome or Android but would be barred from including search in exclusive contracts and required to share some data with competitors. Various portions of this ruling have been appealed by both the DOJ and Google, with the dispute still in progress as of early 2026.
The remedies ruling cracks open something arguably more consequential: Google must share portions of its search index and user-interaction data with qualified competitors. Rivals-from Bing to emerging AI search engines-will finally gain visibility into signals like crawl dates, spam scores, click patterns, and the RankEmbed model data.
On February 4, 2026, the DOJ and a coalition of states formally appealed, arguing that behavioral remedies amount to a slap on the wrist for a "recidivist monopolist." The D.C. Circuit's decision is expected by late 2026. For practitioners, the implications are structural. If competitors gain even partial access to Google's user interaction data, the gap between Google and alternatives like Bing, Brave Search, or AI-native search tools narrows. The longer-term story is bigger: SEO will no longer be a single-platform game. As competing engines gain access to Google's scale-driven insights, authority, brand visibility, and customer trust matter as much as keyword optimization.
Practitioner Playbook: Building for Q* and P*
Every revelation from the trial and the Reid affidavit points toward the same actionable framework. Stop thinking about ranking factors as a checklist. Start thinking about building evidence for Quality (Q*) and Popularity (P*).

For Quality (Q*):
- Author entities matter. The confirmation of attributes like siteAuthority, isAuthor, and authorReputationScore signals a shift toward entity-centric evaluation. Optimization must extend beyond the page to the entities responsible for the content-comprehensive author biographies, consistent bylines, and mentions from reputable external sites.
- Content effort is measurable. The leaked API revealed a contentEffort attribute that uses an LLM to estimate the human labor invested in a page. Generic, thin content scores poorly. Original research, proprietary data, and first-person experience score well.
- PageRank still counts. PageRank has evolved from a simple measure of link volume to a foundational link equity signal that, combined with other trust and quality factors, contributes to a site's overall Q* score. Links from authoritative, topically relevant sources remain essential-they're just one ingredient in a broader recipe now.

For Popularity (P*):
- Earn the last longest click. Design every page so users find what they need, stay engaged, and don't bounce back to the SERP. This single behavioral metric may be the strongest positive signal in NavBoost's architecture.
- Monitor crawl frequency as a health indicator. Crawl frequency isn't random-Google uses engagement and popularity to decide which pages to crawl more often, and high spam scores can reduce crawl priority and impact visibility. A sudden drop in crawl activity is an early warning signal; see the log-parsing sketch after this list.
- Build brand search demand. Google looks at how often people search for a site by name or brand and how often they click on it when it appears in search results. By comparing these signals, Google can assess whether a site is both well-known and genuinely useful.
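Crawl frequency is one of the few signals in this list you can measure yourself today. Here's a minimal sketch that counts daily Googlebot hits in a combined-format access log and flags sharp drops; the log format, user-agent match, and 50% threshold are assumptions about a typical setup.

```python
import re
from collections import Counter

# Matches the timestamp in common/combined access-log lines, e.g.:
# 66.249.66.1 - - [05/Feb/2026:10:12:01 +0000] "GET /post HTTP/1.1" 200 ... "Googlebot/2.1"
DATE = re.compile(r'\[(\d{2}/\w{3}/\d{4}):')

def daily_googlebot_hits(log_lines) -> Counter:
    hits = Counter()
    for line in log_lines:
        if "Googlebot" not in line:
            continue
        m = DATE.search(line)
        if m:
            hits[m.group(1)] += 1
    return hits

def flag_crawl_drops(hits: Counter, drop_ratio: float = 0.5) -> list:
    """Flag days whose crawl volume fell below drop_ratio of the running
    mean of all prior days. Lexicographic sort is fine within one month;
    parse real dates for anything longer."""
    days = sorted(hits)
    flagged = []
    for i, day in enumerate(days[1:], start=1):
        mean = sum(hits[d] for d in days[:i]) / i
        if hits[day] < drop_ratio * mean:
            flagged.append(day)
    return flagged

sample = [
    '66.249.66.1 - - [01/Feb/2026:10:00:00 +0000] "GET /a HTTP/1.1" 200 1 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [01/Feb/2026:11:00:00 +0000] "GET /b HTTP/1.1" 200 1 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [01/Feb/2026:12:00:00 +0000] "GET /c HTTP/1.1" 200 1 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [02/Feb/2026:10:00:00 +0000] "GET /a HTTP/1.1" 200 1 "-" "Googlebot/2.1"',
]
print(flag_crawl_drops(daily_googlebot_hits(sample)))  # ['02/Feb/2026']
```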
The combined picture is clear. Google didn't build a mystery. It built a system that rewards trustworthy content visited by real people who stay, engage, and return. The antitrust proceedings and the Reid affidavit have given us the vocabulary to name every component of that system-from Q* to NavBoost to RankEmbed BERT to the spam scores attached to every DocID. The question isn't whether we understand Google's ranking architecture anymore. The question is whether we'll build sites that deserve to rank within it.