We studied citation patterns across ChatGPT, Perplexity, and Google AI Overviews — drawing on analysis of 680 million citations, 129,000 domains, and our own controlled experiment. The findings challenge most of what the SEO industry assumes about AI visibility.
Three platforms, three different games
The first thing to understand: ChatGPT, Perplexity, and Google AI Overviews don't work the same way. They use different retrieval systems, weight different signals, and favor different source types. A strategy that works for one can be irrelevant for another.
| Factor | ChatGPT | Perplexity | Google AI Overviews |
|---|---|---|---|
| How it finds sources | Bing index + GPTBot crawler | Own 200B+ URL index, real-time per query | Google Search index (top-10 required) |
| #1 cited source type | Wikipedia (47.9%) | Reddit (46.7%) | YouTube (23.3%) |
| Content depth preference | Depth and breadth of coverage | Comprehensive + regularly updated | Semantically complete coverage |
| Freshness sensitivity | Moderate | Extreme (drops after 2-3 days) | Moderate (verified against databases) |
| New sites can win? | Yes — structure and depth matter most | Yes — Reddit presence is the lever | Hard — requires existing Google top-10 ranking |
Sources: Averi.ai (680M citations), SE Ranking (129K domains), Princeton GEO study (KDD 2024), Snezzi platform guide.
That last row is the most important. ChatGPT is the most meritocratic platform for new entrants — a five-week-old domain can earn citations if the content is genuinely superior. Google AI Overviews is the hardest — you need traditional search rankings first. Perplexity sits in between, with Reddit community presence as the key differentiator.
What ChatGPT actually rewards
ChatGPT uses Bing's index plus its own GPTBot crawler to find sources. It doesn't require Google rankings. This makes it the most accessible platform for new or small sites — and the one where content quality matters most directly.
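Because GPTBot is one of the two discovery paths named above, it is worth confirming that a site's robots.txt does not block it by accident. The following is a minimal sketch using Python's standard `urllib.robotparser`; the `SAMPLE_ROBOTS` content, the `example.com` URLs, and the helper name `crawler_allowed` are illustrative assumptions, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks GPTBot from one directory only.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    for path in ("/guides/buying-property", "/private/drafts"):
        url = "https://example.com" + path
        print(path, crawler_allowed(SAMPLE_ROBOTS, "GPTBot", url))
```

In practice you would load the live robots.txt file instead of a string. This checks crawl permission only and says nothing about whether a page is in Bing's index.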
Structural depth: the single biggest factor
The content structure that earns ChatGPT citations looks very different from content optimized for featured snippets. Short, listy content — the kind SEO agencies have produced for the last decade — systematically underperforms. ChatGPT extracts information in coherent chunks, which means each section needs enough substance to stand alone as a quotable source.
Most websites fail this test badly. Individual sections are too thin to provide extractable context. The structural pattern that earns citations requires genuine depth per topic — not padding, but complete treatment of each point before moving on. This is structural work, not word count work, and it's the gap we close first for every client.
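One rough way to find candidates for that structural work is to flag sections that are obviously too short to stand alone. The sketch below assumes the content is stored as Markdown with `##` and `###` headings, and the `min_words` threshold is an arbitrary placeholder, not a figure from the research above. Word count is only a crude proxy here: a flagged section needs a human read, and a long section can still be shallow.

```python
import re

# Matches Markdown H2 and H3 headings such as "## Title" or "### Title".
HEADING = re.compile(r"^#{2,3}\s+(.*)$")


def thin_sections(markdown: str, min_words: int = 120) -> list[tuple[str, int]]:
    """Return (heading, word_count) for each H2/H3 section under min_words."""
    sections = []
    title, words = None, 0
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # A new heading closes the previous section.
            if title is not None:
                sections.append((title, words))
            title, words = match.group(1).strip(), 0
        elif title is not None:
            words += len(line.split())
    if title is not None:
        sections.append((title, words))
    return [(t, n) for t, n in sections if n < min_words]


if __name__ == "__main__":
    sample = "## Taxes\nShort note.\n## Fees\n" + "word " * 200
    print(thin_sections(sample))
```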
Depth compounds
Comprehensive coverage consistently earns more citations than thin content — by a significant margin. But this isn't about hitting a word count. ChatGPT rewards genuine coverage: sub-topics addressed, edge cases handled, nuances that thinner content glosses over. An article that repeats itself doesn't outperform a tighter piece that goes deeper on fewer points.
Statistics with attribution
Pages that include statistics with named methodology and clear attribution earn meaningfully higher visibility. Not "many experts agree" — but "according to Bank Indonesia's 2025 annual report, foreign property investment grew 12.3% year-over-year." AI engines need verifiable claims. Named sources give them something to cross-reference.
AI engines don't trust authoritative-sounding prose. They trust verifiable claims with named sources. The distinction matters.
What doesn't work on ChatGPT
Some findings are counter-intuitive. Question-style headings ("What are the costs of buying property?") actually earn fewer ChatGPT citations than straightforward headings ("Property Transaction Costs in Bali"). FAQ schema shows a similar slight penalty. And keyword-optimized URLs underperform broader, topic-describing URLs.
These are minor effects. We still use question headings and FAQ schema because they help on Perplexity and Google AI Overviews. But it's a reminder that what works for Google search doesn't automatically work for AI citation.
What Perplexity rewards
Perplexity operates differently from ChatGPT in almost every way. It maintains its own index of 200 billion+ URLs and searches the web in real time for every query. Two signals dominate its citation behavior.
Reddit is nearly half of everything
46.7% of Perplexity's top-10 citations come from Reddit. Not from major publications, not from .gov domains, not from corporate websites. Reddit.
This reframes Reddit from "optional social distribution" to "critical Perplexity infrastructure." A business that wants Perplexity visibility needs a genuine Reddit presence — not promotional posts, but expert commentary in relevant subreddits. The content on your website matters, but your community presence determines whether Perplexity finds and trusts you.
Freshness is extreme
Visibility on Perplexity drops noticeably within 2-3 days without a content refresh. This is the most freshness-sensitive platform we've studied. "Last updated" timestamps, monthly data refreshes, and regular content additions aren't nice-to-haves; they're the difference between being cited and being forgotten.
For businesses in fast-moving industries (real estate, legal, finance), this actually works in their favor. A competitor with a static two-year-old guide will be overtaken by a fresher, regularly updated resource within weeks.
What Google AI Overviews rewards
Google AI Overviews appears at the top of Google search results, making it the most valuable citation placement. It's also the hardest to earn.
The reason: 92% of Google AI Overviews citations come from pages already ranking in Google's top 10. Google AI Overviews doesn't discover new sources; it promotes existing winners. For a new site, this makes Google AI Overviews a medium-term goal (3-6 months of building traditional SEO signals first). For a site already ranking well, it's the highest-value target.
Schema markup correlates strongly with Google AI Overviews citations: pages with comprehensive structured data (FAQPage, HowTo, Article, Organization) consistently outperform those without. E-E-A-T signals (experience, expertise, authoritativeness, trustworthiness) are near-mandatory: nearly all citations come from sources with strong E-E-A-T indicators.
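For readers who haven't worked with structured data, here is a minimal sketch of generating Article markup as JSON-LD. The property names (`headline`, `datePublished`, `dateModified`, `author`) come from the schema.org vocabulary. The function name, the sample values, and the choice of which properties to include are assumptions for illustration, since the research above doesn't say which fields matter most.

```python
import json
from datetime import date


def article_jsonld(headline: str, author_org: str,
                   published: date, modified: date) -> str:
    """Build a schema.org Article description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
        "author": {"@type": "Organization", "name": author_org},
    }
    return json.dumps(data, indent=2)


def as_script_tag(jsonld: str) -> str:
    """Wrap JSON-LD in the script element used to embed it in a page."""
    return f'<script type="application/ld+json">\n{jsonld}\n</script>'


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    markup = article_jsonld(
        "Property Transaction Costs in Bali",
        "Example Agency",
        date(2025, 1, 10),
        date(2025, 3, 2),
    )
    print(as_script_tag(markup))
```

The same pattern extends to FAQPage, HowTo, and Organization by changing `@type` and the properties. It is worth validating the output with a structured data testing tool before publishing.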
The universal factors
Despite their differences, all three platforms agree on several things. These are the factors we build into every piece of content, regardless of which platform we're targeting.
What works everywhere
Hierarchical headings. Strict H2-to-H3 flow. Consistently high citation impact across all platforms. The most basic structural requirement, and the one most sites get wrong.
Comparison tables. Strong preference across all three platforms, especially Google AI Overviews. If you're comparing options, use a table, not prose.
Original research or case studies. Significant visibility advantage across platforms. AI engines have an effectively unlimited supply of rewritten secondary content. Original data is scarce, and scarcity gets cited.
Cross-platform brand presence. Mentions of your brand across multiple platforms correlate with citation rate more strongly than backlinks alone. It's not about link building — it's about existing as a recognizable entity across the web.
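Of the four factors above, heading hierarchy is the easiest to check mechanically. The sketch below uses Python's standard `html.parser` to list places where a page jumps down more than one heading level (for example, an H2 followed directly by an H4). Treating any such jump as a violation is our interpretation of "strict flow", and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Record the level (1-6) of each heading tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "h2" matches <H2> as well.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_levels(html: str) -> list[tuple[int, int]]:
    """Return (from_level, to_level) pairs where a heading level is skipped."""
    collector = HeadingCollector()
    collector.feed(html)
    levels = collector.levels
    return [(a, b) for a, b in zip(levels, levels[1:]) if b > a + 1]


if __name__ == "__main__":
    page = "<h1>Guide</h1><h2>Costs</h2><h4>Notary fees</h4><h2>Taxes</h2>"
    print(skipped_levels(page))
```

Moving back up the hierarchy (H4 to H2) is not flagged, since closing a subsection and starting a new section is normal.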
What this means for your strategy
The platform differences create a natural sequencing for most businesses.
If you're starting from scratch — no domain authority, no rankings, no brand — ChatGPT is your first target. It rewards content quality regardless of domain age. Build the best resource on your topic. Cite primary sources. Structure for extraction.
Once you have a content base, invest in Reddit presence to open up Perplexity. Genuine community participation, not promotional linking. Perplexity's real-time search means your Reddit activity and your website content reinforce each other.
As your domain matures and organic rankings build (partly driven by the authority you've accumulated through AI citations and off-site presence), Google AI Overviews becomes achievable. Add comprehensive schema markup. Build E-E-A-T signals. This is where the highest-volume citation opportunities live — but you need the foundation first.
Knowing the landscape is not the same as navigating it
The frameworks above describe how AI citation works in general. What they don't account for is your specific competitive landscape — which competitors are already entrenched in which platforms, which citation opportunities exist in your niche that don't in others, and where the gaps are that your content can actually win.
The other thing they don't account for is measurement. Most businesses that try to "optimize for AI" don't have a way to know if it's working. Which citations are driving actual traffic? Which are driving conversions? Which content changes moved the needle and which didn't? Without that feedback loop, you're optimizing blind.
What we've found is that AI citation is not a one-time structural fix. It's an ongoing intelligence operation — tracking citation patterns over time, identifying what's shifting across platforms, and adapting the content strategy based on data that accumulates specifically for your business in your niche. That intelligence is what compounds. The principles are public. The data isn't.
If you want to see what that looks like for your business — where AI currently mentions your competitors, where it's silent about you, and where the real opportunities are — we'll run the audit free.