Google’s AI Overviews are accurate most of the time, but at search-engine scale, their error rate still turns into millions of incorrect answers every hour.

Google’s AI-generated search summaries have improved, but at the scale of Google Search, even a relatively small error rate becomes a major information problem.

According to a New York Times experiment conducted with AI startup Oumi, Google AI Overviews answered correctly around 90% of the time. That also means roughly 1 in 10 answers was wrong.

Applied to the volume of searches Google handles, that error rate translates into millions of incorrect AI-generated answers every hour.
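The arithmetic behind that claim can be sketched in a few lines. The search volume and the share of searches that show an AI Overview are not given in the reporting, so the two inputs below are hypothetical placeholders chosen only to illustrate the calculation; only the roughly 90% accuracy figure comes from the experiment.

```python
def wrong_answers_per_hour(searches_per_hour, overview_share, accuracy):
    """Estimate how many AI-generated answers per hour are incorrect.

    searches_per_hour: total searches handled in an hour
    overview_share:    fraction of searches that show an AI answer
    accuracy:          fraction of AI answers that are correct
    """
    error_rate = 1.0 - accuracy
    return searches_per_hour * overview_share * error_rate


# ASSUMPTION: illustrative placeholders, not figures from the article.
SEARCHES_PER_HOUR = 500_000_000
OVERVIEW_SHARE = 0.10

# From the article: answers were correct around 90% of the time.
ACCURACY = 0.90

estimate = wrong_answers_per_hour(SEARCHES_PER_HOUR, OVERVIEW_SHARE, ACCURACY)
print(f"{estimate:,.0f} incorrect answers per hour")
```

Under these assumed inputs the estimate comes to about 5 million wrong answers an hour. Different assumptions about volume or coverage shift the total, but any plausible combination at this scale lands in the millions.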

Accuracy has improved, but scale changes the stakes

AI Overviews use Google’s Gemini models to generate short answers directly inside search results.

The system has reportedly improved since earlier testing, rising from around 85% accuracy with Gemini 2.5 to about 91% after the move to Gemini 3.

But improvement is only part of the picture. Even at roughly 90% accuracy, a system deployed across one of the world’s most-used information platforms still produces a large number of false answers.
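A quick calculation puts the reported improvement in perspective. The sketch below uses only the two accuracy figures cited above (about 85% and about 91%); the function name is hypothetical and exists purely for illustration.

```python
def relative_error_reduction(old_accuracy, new_accuracy):
    """Return the fraction of errors eliminated when accuracy improves."""
    old_error = 1.0 - old_accuracy
    new_error = 1.0 - new_accuracy
    return (old_error - new_error) / old_error


# Accuracy figures as reported: about 85% before, about 91% after.
reduction = relative_error_reduction(0.85, 0.91)
print(f"Error rate falls from 15% to 9%, a relative drop of {reduction:.0%}")
```

A six-point gain in accuracy removes roughly 40% of the errors, which is a substantial improvement. It still leaves about 9 wrong answers in every 100.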

Google disputes the methodology

Google has pushed back on the findings, arguing that the test does not reflect how people actually use Search.

The company also criticized the SimpleQA benchmark used in the experiment, saying it may contain inaccuracies. Google says it uses its own more tightly verified version of the test when evaluating performance.

The company’s position is that the study overstates the real-world problem.

Speed, cost and accuracy remain in tension

AI Overviews do not rely on one single model for every answer.

Google has said the system selects what it considers the most relevant model for each query. More powerful models may produce better results, but they are also slower and more expensive to run at search-engine scale.

That leaves Google balancing accuracy against speed, cost and user experience.

The trust problem is bigger than the error rate

A 90% success rate may sound strong by AI industry standards, but Search is different from a chatbot demo or internal benchmark.

When Google places an AI answer at the top of a results page, many users may treat it as authoritative and never click through to the original sources.

That makes each incorrect answer more consequential. Google acknowledges the risk with its own warning that AI can be wrong and that users should double-check the information.

The issue is not that AI Overviews always fail. It is that they are trusted at a scale where even occasional failure becomes massive.

Sources: New York Times experiment with Oumi, Google statements, Ars Technica

This article was written and published by Asger Risom, who may have used AI in its preparation.

Google AI Overviews still produce millions of incorrect answers every hour
