New 129k-site study reveals what quietly boosts your brand citations in ChatGPT

Reviewed by: Andrii Daniv · 11 min read · Nov 27, 2025

A recent study by SE Ranking of 129,000 domains provides one of the clearest quantitative views so far on which site and content attributes correlate with being cited as a source in ChatGPT answers. The findings align more with classic SEO and authority-building work than with narrow AI-specific tactics.

Executive snapshot

Key correlations from the SE Ranking analysis include:[S1]

  • Domains with the largest backlink profiles (350,000+ referring domains) averaged 8.4 ChatGPT citations, versus 1.6-1.8 for sites with modest backlink profiles. A sharp step-up occurred around 32,000 referring domains.
  • High Domain Trust scores (97-100) correlated with 8.4 citations, compared with 1.6 citations for scores below 43. Page Trust plateaued beyond a moderate level.[S1]
  • Long, updated, data-rich content performed best: pages over 2,900 words, updated within 3 months, and containing 19+ statistics tended to reach 5.1-6 citations, versus 2.8-3.6 for short or stale pages.[S1]
  • Strong community and review presence mattered: brands heavily mentioned on Reddit and Quora and listed on multiple review platforms saw 4.6-7 citations, versus 1.7-1.8 for weakly mentioned brands.[S1]
  • Fast page loads and broadly topical URLs correlated with higher citations, while tightly keyword-matched URLs and titles, FAQ schema, and LLMs.txt files showed little or negative impact.[S1]

For marketers, the data suggests that established SEO, authority, and brand-building activities align directly with higher ChatGPT visibility, while narrow "AI SEO" hacks show limited return.

New Data: Top Factors Influencing ChatGPT Citations
Visualization of the top factors that correlate with ChatGPT source citations in SE Ranking's large-scale study.[S1]

ChatGPT citation ranking factors: method and source notes

SE Ranking analyzed 129,000 unique domains and 216,524 pages across 20 niches to identify factors associated with how often ChatGPT cites a domain as a source.[S1] The study modeled "citation likelihood" as the number of times a domain appeared in ChatGPT outputs, then correlated that with dozens of domain-, page-, content-, technical-, and brand-level variables.

Key points about the dataset and methodology:[S1]

  • Scope: Broad mix of industries and site types, including commercial, government (.gov), and educational (.edu) domains.
  • Metrics analyzed: Referring domains, Domain Trust, Page Trust, organic traffic (total and homepage), average Google rankings, content length and structure, update recency, use of expert quotes and statistics, FAQ presence and schema, social and review mentions, page speed metrics (FCP, Speed Index, INP), URL and title keyword matching, and presence of LLMs.txt.
  • Modeling: Correlation and feature-importance analysis to rank which factors best predicted citation count.
  • Timeframe and model version: The study does not specify which ChatGPT version or time period the citation data comes from.
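The correlation-and-ranking step described above can be sketched in a few lines. The records and feature names below are hypothetical stand-ins for illustration, not SE Ranking's actual dataset or methodology:

```python
import math

# Hypothetical per-domain records: feature values and observed citation counts.
# These numbers are illustrative only; they are not SE Ranking's data.
domains = [
    {"referring_domains": 1200, "domain_trust": 35, "citations": 2},
    {"referring_domains": 40000, "domain_trust": 78, "citations": 5},
    {"referring_domains": 360000, "domain_trust": 98, "citations": 9},
    {"referring_domains": 2500, "domain_trust": 41, "citations": 1},
    {"referring_domains": 90000, "domain_trust": 88, "citations": 6},
]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

citations = [d["citations"] for d in domains]
features = ["referring_domains", "domain_trust"]

# Rank features by absolute correlation with citation count.
ranked = sorted(
    features,
    key=lambda f: abs(pearson([d[f] for d in domains], citations)),
    reverse=True,
)
print(ranked)
```

Feature-importance analysis in the study would use a richer model than a single correlation coefficient, but the basic idea — score each variable by how well it tracks citation counts, then rank — is the same.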

Limitations:[S1]

  • The analysis is correlational, not causal; many factors move together (for example, large brands typically have more links, traffic, and mentions).
  • Proprietary scores such as Domain Trust and Page Trust are model-specific estimates of authority, not standardized industry metrics.
  • Some variables (such as the impact of FAQs) may be confounded by the type of pages that usually include them (for example, support versus deep editorial content).

Findings: SEO and authority signals linked to ChatGPT citations

The SE Ranking study identified classic SEO authority signals - backlinks and domain-level trust metrics - as the strongest predictors of ChatGPT citations.[S1]

Backlink quantity and diversity

  • Domains with up to 2,500 referring domains averaged 1.6-1.8 ChatGPT citations.
  • Domains with 350,000+ referring domains averaged 8.4 citations.[S1]
  • A marked acceleration occurred around 32,000 referring domains, where citations nearly doubled from 2.9 to 5.6.[S1]

Domain- vs. page-level authority

  • Domains with Domain Trust <43 averaged 1.6 citations.
  • Scores of 91-96 corresponded to roughly 6 citations.
  • Scores of 97-100 reached 8.4 citations on average.[S1]
  • Page Trust mattered less once past a moderate threshold: any page with a Page Trust of 28+ saw roughly the same citation rate (~8.3), suggesting ChatGPT may lean more on overall domain reputation than individual page authority.[S1]

TLDs and "trusted" zones

  • .gov and .edu domains averaged 3.2 citations, compared with 4.0 for non-"trusted zone" domains.[S1]
  • The authors concluded that domain type alone did not guarantee preference; content quality and value appeared more predictive than TLD.[S1]

Organic traffic and Google rankings

  • Domains with fewer than 190,000 monthly visitors averaged 2-2.9 citations, whether they had 20 or 20,000 organic visitors.[S1]
  • Citation counts only rose meaningfully once traffic exceeded 190,000 monthly visitors; domains with 10+ million visitors averaged 8.5 citations.[S1]
  • High homepage traffic mattered in particular: sites with ≥7,900 organic visitors to the homepage were most likely to be cited.[S1]
  • Pages ranking between positions 1-45 in Google averaged 5 citations, while those in positions 64-75 averaged 3.1 citations.[S1]

Taken together, these results suggest ChatGPT's notion of "source-worthy" content converges with signals that also support strong organic search performance.[S1]

Findings: content depth, structure, and freshness in ChatGPT citations

Content quality indicators - length, structure, expertise signals, and recency - showed consistent relationships with ChatGPT citations.[S1]

Length and section structure

  • Articles under 800 words averaged 3.2 citations.
  • Articles over 2,900 words averaged 5.1 citations.[S1]
  • The best-performing pages organized content into sections of roughly 120-180 words between headings, averaging 4.6 citations.[S1]
  • Very short sections (<50 words) between headings correlated with fewer citations (2.7), suggesting a preference for well-developed topical segments over ultra-fragmented layouts.[S1]
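A draft can be audited against the 120-180-word section pattern reported above with a simple script. This is a rough sketch that assumes a markdown draft with ATX (`#`-style) headings; the study itself does not prescribe any tooling:

```python
import re

def words_between_headings(markdown_text):
    """Return word counts for each section between consecutive headings.

    Simplified assumption: any line starting with '#' (ATX style) is a
    heading; body words are whitespace-separated tokens below it.
    """
    sections = []
    current = []
    for line in markdown_text.splitlines():
        if re.match(r"#{1,6}\s", line):
            if current:
                sections.append(len(" ".join(current).split()))
            current = []
        else:
            current.append(line)
    if current:
        sections.append(len(" ".join(current).split()))
    return sections

draft = """# Guide
intro words here
## Part one
""" + ("word " * 150) + """
## Part two
too short
"""
counts = words_between_headings(draft)
# Flag sections outside the 120-180 word band observed in the study.
flagged = [c for c in counts if not 120 <= c <= 180]
print(counts, flagged)
```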

Expertise and data density

  • Pages featuring expert quotes averaged 4.1 citations, versus 2.4 for those without.[S1]
  • Content with ≥19 statistical data points averaged 5.4 citations, compared to 2.8 for pages with minimal data.[S1]

Freshness and updates

  • Pages updated within the previous 3 months averaged 6 citations.
  • Older, unrefreshed content averaged 3.6 citations.[S1]

FAQs and heading style

  • Pages with FAQ sections recorded 3.8 citations, slightly lower than 4.1 for pages without FAQs.[S1]
  • However, the predictive model treated the absence of FAQ sections as a negative indicator, likely because FAQs tend to appear on simpler support pages with lower overall citation potential.[S1]
  • Headings written as questions (for example, "What is...?") underperformed straightforward topical headings, averaging 3.4 citations versus 4.3.[S1]

Overall, the data points to a preference for substantial, structured, evidence-backed pages that are actively maintained.

Findings: brand, social, and review signals influencing ChatGPT references

Off-page brand indicators - social discussion and review presence - were strongly associated with higher ChatGPT citation counts.[S1]

Discussion platforms: Reddit and Quora

  • Domains with minimal Quora presence (up to 33 mentions) averaged 1.7 citations.
  • Domains with roughly 6.6 million Quora mentions averaged 7 citations.[S1]
  • On Reddit, domains with over 10 million mentions averaged 7 citations, versus 1.8 for those with minimal activity.[S1]

The authors suggested that for smaller or newer sites, discussion-platform activity can partially substitute for the authority signals that large brands gain from extensive backlink and traffic profiles.[S1]

Review platforms

  • Presence on platforms such as Trustpilot, G2, Capterra, Sitejabber, and Yelp correlated with higher citations.[S1]
  • Domains listed on multiple review platforms earned 4.6-6.3 citations on average.
  • Domains absent from such platforms averaged 1.8 citations.[S1]

These results indicate that third-party validation - not just links - appears to feed into whatever signals ChatGPT uses to recognize trustworthy brands.

Findings: technical performance and URL/title patterns in ChatGPT source selection

Technical performance and URL/title patterns also showed measurable relationships with citation frequency.[S1]

Page speed metrics

  • Pages with First Contentful Paint (FCP) <0.4 seconds averaged 6.7 citations, while those with FCP >1.13 seconds averaged 2.1 citations.[S1]
  • A similar pattern emerged for Speed Index: values below 1.14 seconds corresponded to stable citation levels, while metrics above 2.2 seconds saw steep declines.[S1]
  • Interaction to Next Paint (INP) behaved differently: pages with very fast INP (<0.4 seconds) actually received fewer citations (1.6) than those with moderate INP (0.8-1.0 seconds), which averaged 4.5 citations.[S1]
  • The authors proposed that extremely fast INP often signals very simple or static pages, which may lack the depth ChatGPT prefers when assembling answers, even if they load quickly.[S1]

URL and title optimization patterns

  • URLs that broadly described the topic rather than tightly matching a target keyword correlated with more citations:
    • Low semantic relevance between URL and target keyword (0.00-0.57) → 6.4 citations on average.
    • High semantic relevance (0.84-1.00) → 2.7 citations.[S1]
  • Titles showed the same pattern:
    • Titles with low keyword matching averaged 5.9 citations.
    • Titles with high keyword optimization averaged 2.8 citations.[S1]

The study authors concluded that ChatGPT appears to favor URLs and titles that signal broad topical coverage rather than narrow keyword targeting.[S1]

Underperforming AI SEO tactics for ChatGPT visibility

Several tactics often promoted as "AI optimization" showed weak or negative correlation with ChatGPT citations in this dataset.[S1]

FAQ schema markup

  • Pages with FAQ schema averaged 3.6 citations.
  • Pages without FAQ schema averaged 4.2 citations.[S1]

This suggests that simply adding FAQ structured data does not, by itself, make a page more attractive as a ChatGPT source. It may also reflect the typical use of FAQ schema on brief support pages rather than deeper resources.
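For reference, the FAQ structured data discussed here is schema.org's FAQPage JSON-LD format. A minimal example, generated in Python for consistency with the other snippets (the question and answer text are placeholders):

```python
import json

# Minimal FAQPage JSON-LD per schema.org; question/answer text is placeholder.
faq_schema = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "Does FAQ schema boost ChatGPT citations?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "In SE Ranking's data, pages with FAQ schema averaged "
                        "slightly fewer citations (3.6 vs 4.2) than pages without it.",
            },
        }
    ],
}
print(json.dumps(faq_schema, indent=2))
```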

LLMs.txt and outbound links

  • The presence of LLMs.txt files showed negligible impact on citation likelihood.[S1]
  • Outbound links to high-authority sites also had minimal observable effect on citation counts.[S1]

In this modeling, AI-oriented technical tweaks had far less explanatory power than classic authority, traffic, content quality, and brand signals.

Interpretation and implications for marketers

Interpretation (likely): core SEO authority work supports ChatGPT visibility

The dominant weight of backlinks, Domain Trust, and organic traffic indicates that building a strong, reputable domain through links, content, and rankings is highly aligned with being cited by ChatGPT.[S1] Businesses already investing in search visibility and brand authority are likely moving in the right direction for AI visibility as well, even without explicit "AI SEO" projects.

Interpretation (likely): depth, structure, and recency increase source-worthiness

The positive correlations with long-form, well-structured, data-rich, and recently updated content suggest that ChatGPT favors comprehensive resources over thin or outdated pages.[S1] For content roadmaps, this supports:

  • Consolidating scattered articles into substantial guides.
  • Structuring content into clear sections of roughly 120-180 words, backed by data and expert commentary.
  • Refreshing high-value pages at least quarterly in fast-moving topics.

Interpretation (likely): brand presence across the web matters, not just on Google

Reddit and Quora mentions and review-site listings strongly track with citations, which indicates that ChatGPT's training and retrieval favor brands that appear across multiple independent sources.[S1] For smaller brands with limited link equity, sustained participation in relevant communities and review ecosystems looks like a practical way to build AI-recognizable authority.

Interpretation (tentative): over-optimized URLs and titles may signal narrow or lower-value content

The underperformance of highly keyword-matched URLs and titles could mean that ChatGPT's systems are tuned toward resources that cover broader topics or user needs, rather than pages engineered for a single query.[S1] Marketers may benefit from naming pages by topic or user problem, not only by exact-match keywords.

Interpretation (tentative): ultra-simple pages and FAQ-heavy layouts are weak AI sources

Very fast INP and heavy use of FAQs and schema appear most common on lean support content with limited explanatory depth.[S1] These pages remain valuable for user service and traditional search features but are less likely to be surfaced by ChatGPT as authoritative citations.

Interpretation (speculative): ChatGPT and Google may share upstream authority signals

The alignment between Google rankings, Domain Trust, and ChatGPT citations suggests that both systems may draw value from similar underlying patterns - broad link graphs, engagement, and consistent content quality - rather than any direct reliance of one on the other.[S1] This overlap may tighten over time as both continue to incorporate behavioral and brand signals.

Contradictions and data gaps in ChatGPT citation research

Unspecified ChatGPT version and timeframe

The study does not identify which ChatGPT version, training snapshot, or time window was analyzed.[S1] Given that models and data sources change over time, some relationships - especially around newer formats like LLMs.txt - may shift.

FAQ signals: mixed quantitative and model-based results

The raw numbers show FAQs and FAQ schema correlating with slightly fewer citations, yet the predictive model flags their absence as a negative factor.[S1] This tension suggests confounding by page type and underscores the risk of interpreting any single variable in isolation.

TLD expectations vs. data

Many SEO practitioners assume .gov and .edu domains automatically carry more weight. This dataset shows commercial domains averaging more citations than .gov and .edu sites, casting doubt on any simple TLD-based advantage in ChatGPT's source selection.[S1]

Limited view of user engagement and behavioral data

The study did not directly measure user engagement metrics (such as time on page, scroll depth, or bounce rate) or on-site conversion behavior. If ChatGPT or upstream systems use such signals, this analysis cannot capture their effect.

No cross-model comparison

The research focuses solely on ChatGPT. Other AI systems (for example, search-integrated assistants from different vendors) may reference different corpora or weight signals differently, so generalization beyond ChatGPT is uncertain.[S1]

Data appendix: ChatGPT citation metrics overview

Selected metrics from SE Ranking's analysis of 129,000 domains:[S1]

Factor | Lower range (metric → citations) | Higher range (metric → citations)
Referring domains | ≤2,500 → 1.6-1.8 | ≥350,000 → 8.4
Domain Trust | <43 → 1.6 | 97-100 → 8.4
Monthly organic traffic (domain) | <190,000 → 2-2.9 | >10M → 8.5
Content length | <800 words → 3.2 | >2,900 words → 5.1
Update recency | stale → 3.6 | updated <3 months → 6
Statistics per page | minimal data → 2.8 | ≥19 stats → 5.4
Quora mentions | ≤33 → 1.7 | ~6.6M → 7
Reddit mentions | minimal → 1.8 | >10M → 7
FCP | >1.13 s → 2.1 | <0.4 s → 6.7
URL-keyword relevance score | 0.84-1.00 → 2.7 | 0.00-0.57 → 6.4
Title keyword matching | high → 2.8 | low → 5.9

[S1] SE Ranking - large-scale study of factors correlating with ChatGPT citations across 129,000 domains and 216,524 pages (2025).

Author: Etavrian AI, developed by Andrii Daniv to produce and optimize content for the etavrian.com website.
Reviewed by: Andrii Daniv, founder and owner of Etavrian, a performance-driven agency specializing in PPC and SEO services for B2B and e-commerce businesses.