New data on factors driving ChatGPT citations shows that classic SEO signals (links, authority, traffic, page experience) still map closely to how OpenAI's assistant selects sources, with content depth, freshness, and brand presence providing a clear lift in citation likelihood [S1][S2].
Executive snapshot: how ChatGPT citations are shaped
- SE Ranking analyzed 129,000 domains and 216,524 pages in 20 niches to model which factors correlate most strongly with being cited by ChatGPT; referring domains emerged as the top predictor [S1].
- Domains with 2,500 or fewer referring domains averaged about 1.6-1.8 ChatGPT citations, while those with more than 350,000 referring domains averaged 8.4 citations; a sharp jump appeared around 32,000 referring domains (from 2.9 to 5.6 citations) [S1].
- High Domain Trust (91-96) correlated with about 6 citations, rising to 8.4 when Domain Trust reached 97-100, while Page Trust plateaued above a moderate threshold [S1].
- Long, updated, expert-backed content (2,900+ words, recent updates, quotes, and 19+ data points) and fast FCP (under 0.4 seconds) typically doubled citation counts compared with shorter, older, slower content [S1].
- Presence on Reddit, Quora, and major review platforms (Trustpilot, G2, Capterra, Sitejabber, Yelp) aligned with higher citations, often in the 4.6-7 range versus roughly 1.8 for brands with no such presence [S1].
For marketers, this suggests that link authority, authoritative long-form content, technical performance, and active brand presence across communities and review sites are the most defensible levers for influencing ChatGPT citations.
Method and source notes for ChatGPT citation study
SE Ranking conducted a large-scale correlational study on how often domains are cited by ChatGPT and which site attributes align with higher citation counts [S1]. The dataset covered 129,000 unique domains and 216,524 pages across 20 industry niches, with citation counts measured at the domain and page level and compared against SE Ranking's proprietary SEO and performance metrics [S1].
Key variable groups included: backlink profile metrics (referring domains, link diversity), Domain Trust and Page Trust scores, organic traffic (overall and homepage), average Google ranking positions, content characteristics (length, structure, updates, expert quotes, statistics, FAQs), social and review signals (mentions on Reddit/Quora, listings on review platforms), and technical performance indicators (Core Web Vitals-style metrics such as First Contentful Paint, Speed Index, Interaction to Next Paint) [S1].
The analysis used statistical modeling to rank factor importance and examine threshold effects (for example, at 32,000 referring domains) but remained observational; it did not reverse-engineer ChatGPT's internal algorithms or prove causation [S1]. SE Ranking did not specify which ChatGPT version, timeframe, or query set underpinned the citation data, which limits temporal and version-specific conclusions [S1].
Most metrics (Domain Trust, Page Trust, traffic estimates) are model-based approximations, not first-party data from OpenAI or Google, and several factors are interrelated (for example, high traffic often coexists with strong backlink profiles), which can blur individual factor effects [S1].
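As an illustration of the kind of threshold analysis the study describes, the Python sketch below bins domains at the reported referring-domain cutoffs and compares mean citation counts per band. The data frame values are synthetic stand-ins chosen to echo the published averages, not SE Ranking's dataset, and the column names are hypothetical.

```python
import pandas as pd

# Synthetic stand-in for the study data: referring-domain counts and
# observed ChatGPT citation counts per domain (illustrative values only).
df = pd.DataFrame({
    "referring_domains": [500, 2_000, 10_000, 40_000, 100_000, 400_000],
    "citations":         [1.6,   1.8,    2.9,    5.6,     6.0,     8.4],
})

# Bin domains at the thresholds reported in the study (2,500 / 32,000 /
# 350,000 referring domains) and compare mean citations per band.
bins = [0, 2_500, 32_000, 350_000, float("inf")]
labels = ["<=2.5k", "2.5k-32k", "32k-350k", ">350k"]
df["band"] = pd.cut(df["referring_domains"], bins=bins, labels=labels)
mean_citations = df.groupby("band", observed=True)["citations"].mean()
print(mean_citations)
```

This mirrors the observational design: it surfaces jumps between bands but says nothing about causation.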
Backlinks and trust as signals for AI citations
Backlink profile strength emerged as the single strongest predictor of citation likelihood in ChatGPT responses [S1]. Domains with up to 2,500 referring domains averaged roughly 1.6-1.8 citations. In contrast, domains with more than 350,000 referring domains averaged 8.4 citations, showing a steep association between link diversity and AI visibility at scale [S1].
A notable threshold appeared at 32,000 referring domains: domains below that level averaged 2.9 citations, while those above it jumped to 5.6, nearly double [S1]. Domain Trust scores showed a similar pattern: domains with scores below 43 averaged about 1.6 citations, those in the 91-96 range averaged 6, and scores of 97-100 averaged 8.4, matching the top referring-domain band [S1].
Page-level authority mattered less once a minimum was reached. Any page with a Page Trust score of 28 or higher tended to achieve roughly the same citation rate (around 8.3 on average), indicating that ChatGPT appears to weigh domain-level authority more heavily than variations between individual pages once they clear a modest quality bar [S1].
Contrary to common assumptions, .gov and .edu domains did not automatically outperform commercial sites: government and educational domains averaged 3.2 citations compared with 4.0 for domains without trusted-zone TLDs [S1]. SE Ranking's authors noted that content quality and value, rather than domain extension alone, aligned better with citations [S1].
[Interpretation - Likely]
For most brands, sustained link acquisition that improves Domain Trust and referring-domain diversity is one of the most reliable ways to boost ChatGPT citation probability, while chasing .edu or .gov links solely for their TLD is less justified by this dataset.
Traffic and Google rankings in ChatGPT source selection
Domain-level traffic was the second most important factor, but its influence appeared mainly at the upper end of the traffic spectrum [S1]. Sites under roughly 190,000 monthly visitors clustered in a narrow band of 2-2.9 average citations, regardless of whether they received 20 or 20,000 visitors [S1]. The correlation strengthened after that threshold: domains with more than 10 million monthly visitors averaged 8.5 citations, indicating that very high-traffic sites are much more likely to appear as ChatGPT sources [S1].
Homepage traffic was a differentiator. Sites with at least 7,900 organic visitors to their main page showed the highest citation rates, suggesting ChatGPT may treat strong performance of a home or entry page as a key authority indicator [S1]. Average Google ranking positions also correlated with citation rates: pages ranking between positions 1 and 45 in Google search averaged 5 ChatGPT citations, whereas those ranking 64-75 averaged only 3.1 [S1].
SE Ranking noted that this pattern does not prove ChatGPT relies directly on Google's index but indicates that both systems appear to reward similar signals of authority and quality [S1].
[Interpretation - Likely]
Building organic search performance, especially for core pages, likely improves ChatGPT source visibility as well, because the same attributes that support higher rankings and traffic (trusted links, quality content, user signals) are also associated with more citations.
Content depth, structure, and freshness for AI visibility
Content characteristics showed consistent, measurable relationships with citation counts. Articles under 800 words averaged 3.2 ChatGPT citations, while those longer than 2,900 words averaged 5.1, indicating that detailed coverage corresponds with increased use as a source [S1]. Internal structure also mattered. Pages organized into sections of 120-180 words between headings averaged 4.6 citations, while pages with very short subsections under 50 words averaged only 2.7 citations [S1].
Signals of expertise and evidence were especially aligned with citations. Content that included expert quotes averaged 4.1 citations compared with 2.4 for pages without such quotes [S1]. Pages that contained 19 or more statistical data points averaged 5.4 citations, almost double the 2.8 citations seen on pages with minimal data [S1].
Freshness produced one of the clearest gaps: pages updated within the last three months averaged 6 citations, versus 3.6 citations for older content [S1]. Surprisingly, pages including FAQ sections had slightly fewer citations (3.8) than those without (4.1), although SE Ranking's predictive model still treated the absence of an FAQ section as a negative feature, likely because FAQs are common on simple support pages that rarely attract citations [S1]. Question-style headings underperformed straightforward topical headings, averaging 3.4 versus 4.3 citations [S1].
[Interpretation - Likely]
Detailed, well-structured, frequently updated content that incorporates expert commentary and concrete statistics is more likely to be cited by ChatGPT than short, generic, or stale pages, and overly fragmented layouts or question-based headings may not help AI visibility as often assumed.
Social signals, community sites, and review platforms
Brand presence across discussion platforms and review sites correlated strongly with ChatGPT citations [S1]. On Quora, domains with up to 33 brand mentions averaged 1.7 citations, while those with around 6.6 million mentions averaged 7 citations, a roughly fourfold increase [S1]. Reddit showed similar patterns: domains with more than 10 million mentions averaged 7 citations versus 1.8 for domains with minimal activity [S1].
For smaller or less-established domains, SE Ranking highlighted Reddit and Quora engagement as a potential path to building authority signals that ChatGPT appears to recognize, in a manner somewhat analogous to how larger brands benefit from scale in backlinks and organic traffic [S1].
Review platforms also aligned with higher citation rates. Domains listed across multiple review sites such as Trustpilot, G2, Capterra, Sitejabber, and Yelp typically earned between 4.6 and 6.3 citations on average, while domains absent from such platforms averaged only 1.8 citations [S1]. This suggests that external validation from user reviews and discussion threads may help ChatGPT identify trustworthy commercial and software brands [S1].
[Interpretation - Tentative]
Building authentic presence on Reddit, Quora, and review platforms through useful participation and product or service quality that earns reviews appears to contribute authority signals that increase the odds of being cited by ChatGPT, especially for brands without large backlink or traffic footprints.
Page speed and technical quality as ChatGPT trust indicators
Technical performance metrics correlated with ChatGPT citation frequency, with page speed standing out [S1]. Pages with First Contentful Paint (FCP) under 0.4 seconds averaged 6.7 citations, while those with FCP above 1.13 seconds dropped to 2.1 citations [S1]. Speed Index showed a similar pattern, with solid performance below 1.14 seconds and steep declines above 2.2 seconds [S1].
One counterintuitive outcome came from Interaction to Next Paint (INP). Pages with ultra-fast INP under 0.4 seconds averaged only 1.6 citations, while those with moderate INP between 0.8 and 1.0 seconds averaged 4.5 citations [S1]. SE Ranking suggested that extremely fast INP scores likely reflect very simple or static pages (for example, thin landing pages) that lack the depth and complexity seen in more authoritative content, which may explain the lower citation rates despite excellent interactivity metrics [S1].
Overall, the model ranked page speed as a meaningful factor, but not independent of other metrics; high-performing domains often combined fast load times with strong backlinks, content quality, and brand signals [S1].
[Interpretation - Likely]
Maintaining fast FCP and strong overall load performance appears supportive of ChatGPT citation potential, but speed alone does little if the underlying content is thin; AI systems seem to reward fast, substantial content rather than bare-bones pages that happen to load instantly.
URL and title patterns plus tactics that underperformed
URL and title optimization showed patterns that ran against traditional keyword-heavy tactics. Pages whose URLs had low semantic relevance to the target keyword (0.00-0.57 on SE Ranking's scale) averaged 6.4 citations, while URLs with the highest semantic relevance (0.84-1.00) averaged only 2.7 [S1]. Titles followed the same trend: loosely keyword-matched titles averaged 5.9 citations versus 2.8 for highly keyword-optimized ones [S1].
SE Ranking concluded that ChatGPT appears to favor URLs and titles that clearly describe the broader topic rather than those tightly engineered around a single phrase [S1]. Heavy keyword repetition may be more associated with content created for search targeting than with comprehensive reference material, which could lower its usefulness for AI-generated answers.
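SE Ranking's semantic-relevance scale is proprietary and undocumented. As a rough illustration of what a 0-1 URL-to-keyword relevance score can look like, the sketch below uses a simple token-overlap (Jaccard) measure; the function name and scoring method are hypothetical stand-ins, not SE Ranking's metric.

```python
import re

def slug_keyword_similarity(url: str, keyword: str) -> float:
    """Crude 0-1 relevance score: Jaccard overlap between URL slug
    tokens and keyword tokens. Illustrative stand-in only; SE Ranking's
    actual semantic-relevance metric is proprietary."""
    slug = url.rstrip("/").rsplit("/", 1)[-1]
    url_tokens = set(re.split(r"[-_.]+", slug.lower())) - {""}
    kw_tokens = set(keyword.lower().split())
    if not url_tokens or not kw_tokens:
        return 0.0
    return len(url_tokens & kw_tokens) / len(url_tokens | kw_tokens)

# A tightly keyword-matched slug scores 1.0 on this crude measure ...
print(slug_keyword_similarity(
    "https://example.com/best-crm-software", "best crm software"))
# ... while a broader, topic-level slug scores much lower.
print(slug_keyword_similarity(
    "https://example.com/crm-buyers-guide", "best crm software"))
```

Under the study's findings, the second, topic-level URL would counterintuitively be the better-cited pattern despite its lower keyword match.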
Several popular AI-optimization tactics underperformed. Pages using FAQ schema markup averaged 3.6 citations compared with 4.2 for pages without it [S1]. LLMs.txt files showed negligible influence on citation likelihood, and outbound links to high-authority sites had minimal measured effect in this dataset [S1].
[Interpretation - Tentative]
Descriptive, topic-level URLs and titles that avoid keyword stuffing appear more aligned with ChatGPT's source selection than aggressively optimized strings, and current AI-specific tactics like LLMs.txt or FAQ schema alone do not show meaningful impact without strong fundamentals.
Interpretation and implications for SEO and marketing strategy
[Interpretation - Likely]
- Classic SEO fundamentals - diverse referring domains, high Domain Trust, solid organic traffic, strong rankings, and fast pages - map closely to higher ChatGPT citation counts [S1]. Investments that already pay off in search visibility are likely to support AI visibility as well.
- Content strategy should prioritize depth, structure, and evidence: long-form pages (around 3,000+ words), logical 120-180-word sections, expert commentary, and dense use of reliable statistics are strongly associated with more frequent citations [S1].
- Regular updates matter. Rewriting or refreshing high-value pages at least every few months likely improves their attractiveness as AI sources [S1].
[Interpretation - Tentative]
- AI systems and Google appear to respond to overlapping authority and quality signals, meaning a unified approach to SEO and AI visibility is more efficient than treating them as separate problems [S1].
- For small or mid-sized brands, sustained activity on Reddit and Quora plus credible listings on review platforms may partly substitute for the scale advantages of dominant domains, especially if paired with focused link building and strong content [S1].
- Question-based headings and FAQ blocks may still help human users or specific search features, but they should not replace clear topical headings and comprehensive body content if AI visibility is a priority [S1].
[Interpretation - Speculative]
- The negative association between ultra-fast INP and citations may reflect page type rather than speed itself; AI models might implicitly favor pages with richer layouts and interaction patterns that often coincide with deeper content, even at the cost of slightly slower INP.
- Over time, as OpenAI refines ChatGPT's browsing and citation behavior, signals like structured data for AI (for example, LLMs.txt or future analogues) could gain importance, even though this dataset shows little current effect [S1].
For planning, the evidence points toward a practical hierarchy: build domain authority and traffic, create and maintain rich expert content, keep pages fast, and cultivate brand proof off-site. AI-specific tweaks appear secondary to these broader efforts.
Contradictions, data gaps, and open questions
The dataset contains several findings that run counter to common industry assumptions. Question-style headings and FAQ sections, often recommended for voice search and user support, correlated with slightly lower citation counts, even though SE Ranking's model treated the absence of FAQs as a negative feature once page type was controlled for [S1]. This suggests that FAQ presence may be a proxy for simpler informational pages rather than a direct negative signal.
The underperformance of FAQ schema and the negligible impact of LLMs.txt contradict some early advice on optimizing for AI assistants, but these results are limited to ChatGPT and the specific timeframe and configuration SE Ranking used, which was not fully disclosed [S1]. It remains unclear whether these patterns will hold as OpenAI updates models and retrieval pipelines.
Key gaps include: no public detail on the query set or vertical mix used to collect ChatGPT citations, potential bias toward niches where SE Ranking has stronger data coverage, and limited visibility into how often ChatGPT accessed external pages versus internal training data [S1]. The strong alignment with Google rankings suggests shared evaluation patterns but does not resolve whether ChatGPT consults live search results or relies on internal authority scoring.
For now, these findings should be treated as directional guidance for how one generation of ChatGPT behaves, not a definitive specification of AI ranking rules.
Data appendix: summary of ChatGPT citation factors
Core dataset and variables
- 129,000 unique domains, 216,524 pages, 20 niches [S1].
- Outcome: average ChatGPT citations per domain or page [S1].
- Key groups: backlinks (referring domains), Domain Trust and Page Trust, domain and homepage traffic, average Google rankings, content length and structure, expert quotes and statistics, update recency, FAQ presence and schema, question-style versus statement headings, Reddit and Quora mentions, review-platform presence, Core Web Vitals-style metrics (FCP, Speed Index, INP), URL or title to keyword similarity, LLMs.txt, and outbound links [S1].
Illustrative factor ranges and associated citations
- Referring domains: 2,500 or fewer leads to about 1.6-1.8 citations; more than 32,000 leads to 5.6; more than 350,000 leads to 8.4 [S1].
- Domain Trust: under 43 leads to 1.6 citations; 91-96 leads to 6; 97-100 leads to 8.4 [S1].
- Traffic: under 190,000 monthly visitors leads to 2-2.9 citations; more than 10 million leads to 8.5 [S1].
- Content: under 800 words leads to 3.2 citations; more than 2,900 words leads to 5.1; updated within 3 months leads to 6 versus 3.6 when older [S1].
- Evidence: expert quotes present leads to 4.1 versus 2.4; 19 or more statistical data points leads to 5.4 versus 2.8 [S1].
- Community and review: strong Reddit or Quora presence and multiple review-site listings typically lead to 4.6-7 citations versus roughly 1.8 when absent [S1].
- Speed: FCP under 0.4 seconds leads to 6.7 versus more than 1.13 seconds leading to 2.1; INP of 0.8-1.0 seconds leads to 4.5 versus under 0.4 seconds leading to 1.6 [S1].
- URL and title optimization: low keyword similarity leads to 5.9-6.4 citations versus high similarity leading to 2.7-2.8 [S1].
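To make the ranges above easier to apply in a content audit, the sketch below flags which of the study's reported cutoffs a page clears. The function, field names, and pass/fail framing are illustrative only, not SE Ranking's model, and clearing a threshold is a correlational signal, not a guarantee of citations.

```python
# Illustrative audit sketch: flag which of the study's reported
# thresholds a page clears. Field names are hypothetical; thresholds
# come from the ranges listed above.
def citation_readiness(page: dict) -> list[str]:
    checks = {
        "referring_domains >= 32,000": page.get("referring_domains", 0) >= 32_000,
        "word_count >= 2,900":         page.get("word_count", 0) >= 2_900,
        "updated in last 3 months":    page.get("months_since_update", 99) <= 3,
        "19+ statistical data points": page.get("data_points", 0) >= 19,
        "FCP under 0.4s":              page.get("fcp_seconds", 9.9) < 0.4,
    }
    return [name for name, passed in checks.items() if passed]

page = {"referring_domains": 40_000, "word_count": 3_200,
        "months_since_update": 2, "data_points": 5, "fcp_seconds": 0.9}
print(citation_readiness(page))
```

For the example page, the authority, length, and freshness checks pass while the data-density and speed checks do not, pointing at where remediation effort would go first.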
Sources
- [S1] SE Ranking, "Ranking Factors for ChatGPT" (2025) - primary study on ChatGPT citation factors.
- [S2] Matt G. Southern, Search Engine Journal coverage of SE Ranking's findings (2025).