Source: dev.to · Topic: large-language-models

How AI engines actually decide what to cite (ChatGPT, Perplexity, Gemini, AI Overviews)

A developer analyzed how four major AI search engines—ChatGPT, Perplexity, Gemini, and Google's AI Overviews—retrieve and cite sources, revealing distinct citation patterns. ChatGPT cites only about 15% of browsed pages and names brands three times more often than linking them. Perplexity heavily relies on community content, with Reddit accounting for ~47% of top citations. Gemini uses Google's live index and Knowledge Graph, with only 38% of AI Overview citations coming from top-10 results. AI Overviews employs query fan-out, pulling most citations from below position #1, and has the weakest freshness bias among the engines.

Published Jun 21, 2026 · 2 min read

Everyone keeps asking "is SEO dead." Wrong question.

AI search doesn't show ten blue links. It generates one answer and names a few brands. If you're not in that answer, you don't exist for that query. So the real question is: how do these engines decide who to name?

I went down a rabbit hole on how four of them actually retrieve and cite sources. Here's what's true in 2026, with real numbers.

ChatGPT answers in two modes. Default mode answers from trained-in memory, no live web. Search mode browses and attaches citations. The key fact: when it browses, it cites only about 15% of the pages it pulls (AirOps study of 548k pages). And it names brands roughly 3x more often than it links them.
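If you want to measure the mention-versus-link gap for your own brand, a minimal sketch might look like the one below. It assumes you have already collected answers by hand or with your own script. The record shape (`text`, `citations`), the brand "Acme", and the domain `acme.example` are all hypothetical, the sample data is made up, and nothing here calls a real engine API.

```python
def mention_and_link_counts(answers, brand, domain):
    """Count answers that name the brand and answers that cite its domain.

    `answers` is a list of dicts with a "text" string and a "citations"
    list of URLs. This shape is an assumption made for the example.
    """
    brand_lower = brand.lower()
    # An answer "mentions" the brand if the name appears anywhere in its text.
    mentions = sum(1 for a in answers if brand_lower in a["text"].lower())
    # An answer "links" the brand if any cited URL contains the domain.
    links = sum(
        1 for a in answers if any(domain in url for url in a["citations"])
    )
    return mentions, links


# Made-up sample answers, only to show the counting.
answers = [
    {"text": "Acme and Globex are popular picks.",
     "citations": ["https://reviews.example/crm"]},
    {"text": "Many teams use Acme for this.",
     "citations": ["https://acme.example/pricing"]},
    {"text": "Globex is the usual choice.", "citations": []},
    {"text": "Try acme if you need automation.",
     "citations": ["https://forum.example/t/123"]},
]

mentions, links = mention_and_link_counts(answers, "Acme", "acme.example")
print(f"named in {mentions} answers, linked in {links}")
```

Tracking both counts separately matters because a brand can be named often and still almost never be linked.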

So two things get you in:

Perplexity does live retrieval and grounds every answer in sources. Its defining trait: it leans on community content hard. One 2025 study found Reddit was its most-cited source, ~47% of top citations. It also rewards answer-first pages, because its reranker scores for how cleanly it can extract a passage. A page can rank #1 on Google and never get cited here if the answer is buried.
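To make "answer-first" concrete, here is a toy heuristic that scores a page by how early a sentence covering all the query terms appears. This is not Perplexity's reranker, whose internals aren't public; the function name, the sentence splitting, and the 1/position scoring are all assumptions chosen for illustration.

```python
import re


def answer_position_score(page_text, query_terms):
    """Score how early a page states its answer.

    Returns 1 / (position of the first sentence containing every query
    term), so an answer in the first sentence scores 1.0. Returns 0.0 if
    no single sentence contains all the terms. Toy heuristic only.
    """
    # Naive sentence split: break on whitespace after ., ! or ?
    sentences = [
        s.strip() for s in re.split(r"(?<=[.!?])\s+", page_text) if s.strip()
    ]
    terms = [t.lower() for t in query_terms]
    for index, sentence in enumerate(sentences):
        lowered = sentence.lower()
        if all(term in lowered for term in terms):
            return 1.0 / (index + 1)
    return 0.0


answer_first = (
    "A sitemap is an XML file that lists your URLs. "
    "It helps crawlers find pages."
)
buried = (
    "Welcome to our blog. We have been writing since 2010. "
    "Today we talk about SEO. "
    "A sitemap is an XML file that lists your URLs."
)

print(answer_position_score(answer_first, ["sitemap", "xml"]))  # 1.0
print(answer_position_score(buried, ["sitemap", "xml"]))        # 0.25
```

Both pages contain the same answer sentence. Only its position differs, which is the whole point of leading with the answer.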

Gemini is the only major assistant running on Google's own live index plus the Knowledge Graph. So classical SEO is the floor, not optional. The twist: ranking #1 isn't enough anymore. Only about 38% of Google's AI Overview citations come from the top 10 results, down from ~76% a year earlier. It pulls from deeper now, via sub-queries.

AI Overviews uses "query fan-out" - it splits your question into 8-12 sub-queries and pools the results. Most citations come from pages that don't rank #1 for the original query, and roughly 63% come from below the top 10 entirely. And counterintuitively, it has the weakest freshness bias of the major engines. Established, authoritative pages keep getting cited even without recent updates, which is the opposite of ChatGPT and Perplexity.
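Here's a rough sketch of why fan-out favors pages that aren't #1 for the head query. It pools ranked results from several sub-queries with reciprocal rank fusion, a generic pooling technique I'm substituting for illustration; how Google actually merges sub-query results isn't documented here. The sub-queries and `.example` domains are made up.

```python
from collections import defaultdict


def fan_out_pool(sub_query_results, top_k=5, k=60):
    """Pool ranked URL lists from several sub-queries into one ranking.

    `sub_query_results` maps each sub-query to a list of URLs, best first.
    Each URL earns 1 / (k + rank) per list it appears in (reciprocal rank
    fusion); k=60 is the commonly used damping constant.
    """
    scores = defaultdict(float)
    for ranked_urls in sub_query_results.values():
        for rank, url in enumerate(ranked_urls, start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest pooled score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda url: (-scores[url], url))[:top_k]


# Hypothetical results: the first entry stands in for the original query.
results = {
    "best crm for startups": ["a.example", "b.example", "c.example"],
    "crm pricing comparison": ["c.example", "d.example", "a.example"],
    "crm with email automation": ["c.example", "e.example", "b.example"],
}

pooled = fan_out_pool(results, top_k=3)
print(pooled)  # ['c.example', 'a.example', 'b.example']
```

In this made-up data `c.example` sits at position 3 for the original query but tops the pooled list, because it shows up across all three sub-queries.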

I got tired of checking this by hand, so I built FixAEO - a free tool to see how AI engines describe and recommend your brand across 8 engines, plus a free llms.txt validator. Sharing in case it saves you the manual prompting.

What have you noticed about getting cited by AI? Curious if others are seeing the same patterns.
