The Complete Guide to Google BERT and MUM

Key takeaways

BERT and MUM are transformer-based natural language models Google uses to understand search queries and content. BERT, applied to Search from October 2019, reads words bidirectionally to grasp context and intent. MUM, introduced in May 2021, is multimodal and multilingual, trained across 75 languages, and powers specific features rather than general ranking. Both reward content written naturally for people.

BERT (Bidirectional Encoder Representations from Transformers) launched in Google Search on October 25, 2019, initially affecting about 1 in 10 US English queries.
BERT reads words in both directions at once, so it understands context and nuance like prepositions instead of treating queries as bags of keywords.
MUM (Multitask Unified Model), introduced May 18, 2021, is multimodal, trained across 75 languages, and described by Google as 1,000 times more powerful than BERT.
Google states MUM is not used for general ranking; it powers specific features such as COVID-19 vaccine information and featured snippet improvements.
Both models reward clear, natural, intent-matching content over keyword stuffing or exact-match phrasing.

What BERT and MUM are

Definition

BERT stands for Bidirectional Encoder Representations from Transformers. It is a natural language processing model built on the transformer architecture, introduced in a 2018 research paper by Google AI researchers Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Google describes BERT in its ranking systems documentation as an AI system that allows it to understand how combinations of words express different meanings and intent.

MUM stands for Multitask Unified Model. Google introduced it at Google I/O on May 18, 2021, describing it as an AI system capable of both understanding and generating language. MUM is built on the T5 text-to-text framework, is multilingual and multimodal, and is, by Google's own description, 1,000 times more powerful than BERT.

Both are part of how Google moved from matching strings of text to understanding meaning. They do not replace each other. BERT is applied broadly across query understanding and ranking, while MUM powers a narrower set of specific features. For the wider context, see our pillar guide to Google's ranking algorithms.

BERT and MUM at a glance

BERT stands for: Bidirectional Encoder Representations from Transformers
MUM stands for: Multitask Unified Model
BERT in Search: October 25, 2019
MUM introduced: May 18, 2021, at Google I/O
Relative power: MUM about 1,000 times more powerful than BERT
MUM languages: Trained across 75 languages
MUM modality: Multimodal (text and images)
MUM and ranking: Not used for general ranking

How BERT works: bidirectional context

Before BERT, many language models read text in one direction, left to right or right to left. BERT reads the whole sentence at once, considering each word in relation to all the other words around it. This is what bidirectional means: context comes from both sides simultaneously.
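The difference between one-directional and bidirectional reading can be illustrated with a toy attention mask. This is a minimal sketch, not Google's implementation: the function names and the example tokens are hypothetical, and a real transformer applies such masks to attention scores rather than to word lists.

```python
def causal_mask(n):
    """Left-to-right mask: position i may attend only to positions 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Bidirectional mask: every position may attend to every position."""
    return [[1] * n for _ in range(n)]


def visible_context(tokens, mask, index):
    """Return the tokens that the token at `index` is allowed to see."""
    return [tok for tok, allowed in zip(tokens, mask[index]) if allowed]


# Hypothetical example tokens, chosen only for illustration.
tokens = ["traveler", "to", "usa", "need", "visa"]
n = len(tokens)

# What the word "to" (index 1) can see under each scheme.
left_to_right = visible_context(tokens, causal_mask(n), 1)
both_sides = visible_context(tokens, bidirectional_mask(n), 1)

print(left_to_right)  # ['traveler', 'to']
print(both_sides)     # ['traveler', 'to', 'usa', 'need', 'visa']
```

Under the left-to-right mask, the word "to" never sees "usa", which is the word that tells a reader where the traveler is going. Under the bidirectional mask it sees the whole query.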

BERT was pre-trained on two tasks. The first, masked language modeling, hides about 15 percent of the tokens in the input and asks the model to predict them from the surrounding context. The second, next sentence prediction, trains the model to judge whether one sentence actually follows another in the source text. Together these tasks teach BERT how language fits together.
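The masking step of the first task can be sketched in a few lines. This is a simplified illustration under stated assumptions: it masks whole words rather than WordPiece tokens, always substitutes a mask symbol (the BERT paper sometimes substitutes a random token or leaves the original in place), and the function name, seed, and sample sentence are hypothetical.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Hide roughly `mask_rate` of the tokens and record the hidden originals.

    Returns the masked sequence plus a {position: original_token} mapping,
    which is what a model would be trained to predict.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_to_mask = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_to_mask))

    masked = list(tokens)
    labels = {}
    for pos in positions:
        labels[pos] = tokens[pos]
        masked[pos] = MASK
    return masked, labels


# Hypothetical 16-word sample sentence, so about 15 percent is 2 words.
sentence = (
    "the traveler from brazil needs a visa to enter "
    "the usa this year because rules changed"
).split()

masked, labels = mask_tokens(sentence)
print(" ".join(masked))
print(labels)
```

The model only ever sees `masked`; `labels` holds the answers it is scored against, which forces it to use the words on both sides of each gap.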

The practical payoff is nuance. Small words such as "for," "to," and "no" can completely change what a query means. Google's launch example was the search "2019 brazil traveler to usa need a visa," where the word "to" signals direction of travel. Older systems often ignored it; BERT understands it matters. This is the same shift toward meaning that RankBrain began in 2015.
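A short sketch shows why treating a query as a bag of keywords loses the direction of travel. The first query is Google's launch example; the reversed query and the helper name are hypothetical, added only for contrast, and this is not how any Google system is implemented.

```python
from collections import Counter


def bag_of_words(query):
    """Reduce a query to unordered word counts, discarding word order."""
    return Counter(query.lower().split())


to_usa = "2019 brazil traveler to usa need a visa"
# Hypothetical reversed query: same words, opposite direction of travel.
to_brazil = "2019 usa traveler to brazil need a visa"

same_bag = bag_of_words(to_usa) == bag_of_words(to_brazil)
same_order = to_usa.split() == to_brazil.split()

print(same_bag)    # True
print(same_order)  # False
```

Both queries contain exactly the same words, so a bag-of-words view cannot tell them apart. Only a model that reads "to" in relation to the words on either side can separate a Brazilian traveling to the USA from the reverse.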

How MUM works: multimodal and multilingual

MUM extends the transformer approach in three directions. It is multitask, trained to do many things at once. It is multilingual, trained across 75 languages, so it can learn from information in one language and apply that knowledge in another. And it is multimodal, meaning it understands information across text and images, with Google stating it can expand to more formats such as video and audio over time.

The capability Google highlighted was complex, multi-step questions. Its example involved a hiker who had climbed Mount Adams and wanted to prepare for Mount Fuji, asking what to do differently. Answering that well requires connecting elevation, trail conditions, gear, and seasonal weather, the kind of reasoning that normally takes a person several separate searches.

Importantly, MUM is not a general ranking system. Google states in its ranking systems guide that MUM is not currently used for general ranking, but rather for specific applications.

What changed for content and intent matching

The shift from keyword matching to meaning matching has a direct effect on how content should be written. Exact-match keywords matter less because Google can now connect a query to relevant content even when the wording differs. Synonyms, paraphrases, and natural phrasing are understood.

This rewards a few specific habits:

Write for the question behind the query. Match the searcher's actual intent, not just their literal words.
Use natural language. Awkward keyword stuffing now reads as low quality and adds no ranking benefit.
Cover topics thoroughly. Models that understand context reward content that fully answers a question, including the related sub-questions a person would naturally ask next.
Be precise. Because the models read nuance, clear and accurate phrasing helps Google match your page to the right queries.

None of this is a trick. Both BERT and MUM are designed to surface content that genuinely answers people well, which is why Google has consistently said the best response is to write helpful content for users rather than for algorithms. That principle is enforced more directly by Google's Helpful Content system.

BERT versus MUM at a glance

The two models are often discussed together but serve different roles:

Scope. BERT is applied broadly to query understanding and ranking. MUM powers specific features and is not used for general ranking.
Power. Google describes MUM as roughly 1,000 times more powerful than BERT.
Modality. BERT works with text. MUM is multimodal, understanding text and images with more formats planned.
Language. BERT in Search expanded to over 70 languages. MUM was trained across 75 languages and can transfer knowledge between them.
Function. BERT understands language. MUM both understands and generates language.

For day-to-day content work, BERT is the model that most directly shapes how your pages are matched to queries, while MUM signals the direction Google is heading: richer, cross-language, cross-format understanding.

History of BERT and MUM: a timeline

Both models grew out of the 2017 transformer breakthrough, with BERT reaching Search in 2019 and the far larger MUM arriving in 2021.

2017: Transformer architecture published
Google researchers publish the "Attention Is All You Need" paper, introducing the transformer architecture that BERT and MUM are both built on.

2018: BERT research paper released
Devlin, Chang, Lee, and Toutanova publish "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" on October 11, 2018.

2019: BERT applied to Google Search
On October 25, 2019, Google begins using BERT for ranking and featured snippets, initially affecting about 1 in 10 US English queries, then expanding to over 70 languages by December 2019.

2020: BERT used on nearly all English queries
By October 2020, Google states that almost every English-language query is processed by a BERT model.

2021: MUM introduced at Google I/O
On May 18, 2021, Google introduces MUM, a multimodal, multilingual model trained across 75 languages that it describes as 1,000 times more powerful than BERT.

2021: MUM's first real-world application
Google uses MUM to identify over 800 variations of COVID-19 vaccine names across more than 50 languages, then rolls out features like "Things to know" in September 2021.

What BERT and MUM reward

Neither model is a dial you can turn. Instead, each rewards a particular quality in your content. Knowing what they respond to helps you write pages that match the right queries.

Content qualities BERT and MUM respond to
What they reward	Why it matters
Query intent over exact keywords	BERT lets Google match content to the meaning behind a query, so pages that answer the underlying question rank even when wording differs from the search.
Natural, contextual phrasing	Bidirectional reading means small connecting words and sentence structure carry meaning, so clear natural prose helps Google understand a page correctly.
Topical depth and related sub-questions	Context-aware models reward content that thoroughly answers a topic, including the follow-up questions a searcher would naturally ask.
Multilingual coverage	MUM can transfer knowledge across its 75 trained languages, raising the value of accurate information regardless of the language it was originally published in.
Multimodal content	MUM understands text and images together, signaling growing value in pairing clear writing with relevant, well-described visual content.

The practical takeaway is that writing clearly and answering fully is the optimization. There is no separate lever for these models beyond the quality of the content itself.

How to optimize for BERT and MUM

You cannot optimize for these models directly; you align with them by writing clear, natural content that fully answers the intent behind a query and covers the related questions a searcher would ask next.

Write to satisfy the intent behind the query, not just the literal keyword. BERT matches content to meaning, so pages that answer the real question outperform pages stuffed with exact-match phrases.
Use clear, natural language and keep sentences precise. Bidirectional reading means context words and structure carry meaning, so natural phrasing helps Google interpret your page accurately.
Cover a topic thoroughly, including the obvious follow-up questions. Context-aware models reward content that fully resolves a searcher's need, which also positions pages for featured snippets.
Add a clear FAQ or question-and-answer section in plain language. Conversational, well-structured answers map cleanly to how BERT interprets natural-language and long-tail queries.
Pair text with relevant, well-captioned images where it helps the reader. MUM is multimodal, and good visual content with descriptive alt text supports the direction Google's understanding is moving.
Publish accurate information without obsessing over keyword density. Both models reward genuinely helpful content, and keyword stuffing now reads as low quality with no ranking benefit.
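The image advice in the list above is the easiest item to check mechanically. The following is a minimal sketch, assuming you have a page's HTML as a string; the class name, function name, and sample file names are hypothetical, and it only detects missing or empty alt attributes, not whether the text is actually descriptive.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> whose alt text is missing or blank."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # Note: an empty alt is valid for purely decorative images,
        # so treat the results as a review list, not a list of errors.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src values of images that lack descriptive alt text."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


# Hypothetical sample markup for illustration.
sample = """
<img src="fuji-trail.jpg" alt="Hikers on a trail on Mount Fuji">
<img src="gear.jpg" alt="">
<img src="map.png">
"""

print(images_missing_alt(sample))  # ['gear.jpg', 'map.png']
```

A check like this only flags gaps. Writing alt text that accurately describes the image is still an editorial task.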

BERT and MUM myths vs. reality

Few SEO topics attract as much confusion as Google's language models. Here are the most common myths and what is actually true.

Myth: BERT and MUM are ranking factors you can optimize directly.

Reality: They are language understanding models, not knobs. You cannot optimize for them directly; you align with them by writing clear, helpful, intent-matching content.

Myth: MUM ranks all of Google's search results.

Reality: Google explicitly states in its ranking systems guide that MUM is not currently used for general ranking. It powers specific applications such as COVID-19 vaccine information and featured snippet improvements.

Myth: You need to repeat keywords more to help BERT.

Reality: The opposite is true. BERT understands synonyms and context, so keyword stuffing adds no benefit and can signal low-quality content.

Myth: BERT replaced Google's other ranking systems.

Reality: BERT is one of several systems in Google's ranking stack. It improves language understanding but works alongside many other signals and systems.

Myth: MUM made BERT obsolete.

Reality: They coexist and serve different roles. BERT broadly handles query understanding and ranking, while MUM is applied to a narrower set of specific, often multimodal features.

Frequently asked questions

What does BERT stand for?
BERT stands for Bidirectional Encoder Representations from Transformers. It is a natural language processing model Google introduced in a 2018 research paper and began using in Search ranking and featured snippets on October 25, 2019, to better understand the meaning and intent behind queries.

What does MUM stand for?
MUM stands for Multitask Unified Model. Google introduced it on May 18, 2021, describing it as an AI system that both understands and generates language. It is multimodal, multilingual across 75 languages, built on the T5 framework, and described by Google as 1,000 times more powerful than BERT.

How is MUM different from BERT?
BERT understands text and is applied broadly to query understanding and ranking. MUM is multimodal, understanding both text and images, is trained across 75 languages, can generate language, and is far more powerful. However, Google states MUM is not used for general ranking, only specific applications.

Does MUM rank search results?
No. Google states in its ranking systems guide that MUM is not currently used for general ranking. Instead, it powers specific applications, such as identifying over 800 COVID-19 vaccine name variations across more than 50 languages and improving certain features like featured snippet callouts.

How does BERT understand search queries?
BERT reads every word in a query in relation to all the other words at once, in both directions. This bidirectional context lets it grasp how small words like prepositions change meaning, so it matches a query to relevant content based on intent rather than exact keyword matching.

How should I optimize my content for BERT and MUM?
You cannot optimize for them directly. The reliable approach is to write clear, natural content that fully answers the intent behind a query, covers related sub-questions, and avoids keyword stuffing. This aligns with what both models are designed to reward: genuinely helpful content for people.

When did Google start using BERT in Search?
Google began applying BERT to Search on October 25, 2019, initially affecting about 1 in 10 US English queries for ranking and featured snippets. It expanded to over 70 languages by December 2019, and by October 2020 nearly every English-language query was processed by a BERT model.

Is keyword stuffing still useful with BERT?
No. Because BERT understands context, synonyms, and natural phrasing, repeating keywords adds no ranking benefit and can read as low-quality content. The better approach is writing naturally and precisely so Google can accurately match your page to relevant queries.

The bottom line

BERT and MUM moved Google from matching strings of text to understanding meaning. BERT reads queries bidirectionally and now touches nearly every English search, while the far larger, multimodal MUM powers specific features rather than general ranking. Neither is a dial you can turn. The way to win with both is the same: write clearly and naturally, answer the real intent behind a query, cover the questions people ask next, and skip the keyword stuffing.

About the author

Capconvert Editorial Team

Search and Content Strategy at Capconvert

The Capconvert Editorial Team writes about search engine algorithms, content strategy, and generative engine optimization. Our guides distill primary sources, including Google's own documentation and research papers, into clear, practical explanations for marketers and site owners.
