SE Ranking's analysis of 300,000 domains indicates that llms.txt currently has no measurable effect on how often sites are cited in LLM answers. The practical question for marketers is whether to treat it as a visibility lever, a compliance tool, or a low-priority experiment.
Key takeaways: llms.txt and AI citations
- llms.txt is not a visibility lever today. In SE Ranking's new analysis of 300,000 domains, its presence showed no detectable relationship with LLM citation frequency, and removing it improved the citation prediction model's accuracy [S1]. Treat it as a governance tool, not a growth channel.
- Adoption is very low (around 10 percent), which means platforms have little incentive to treat it as a ranking or citation signal yet [S1]. You are not losing competitive AI visibility by skipping it in the short term.
- Platform guidance matches the data: Google's AI search guidance and OpenAI's crawler documentation highlight robots.txt and existing search signals, not llms.txt, for AI Overviews and search features [S2–S4]. That lowers the odds of any near-term hidden boost from adding llms.txt.
- Implementation is cheap but opportunity cost is real. Adding the file is quick, but engineering and SEO attention are limited. For most sites, that time produces more benefit when directed toward AI-search-friendly content, structured data, and technical health.
- Strategic framing matters. Internally, position llms.txt as future-proofing and control over training use, not as something that will raise AI traffic or brand citations this quarter.
Situation snapshot
The trigger is SE Ranking's analysis of roughly 300,000 domains and their presence in LLM-generated citations, summarized by Search Engine Journal [S1, S2].
Key factual points:
- Adoption: llms.txt was present on 10.13 percent of domains in the sample, or roughly 1 in 10 sites. High-traffic domains were slightly less likely to use it than mid-tier sites [S1].
- Citation analysis: SE Ranking measured how often domains were cited in responses from prominent LLMs, then:
  - Ran standard correlation tests.
  - Fed features (including llms.txt presence) into an XGBoost model to predict which domains receive citations [S1].
- Result:
  - No statistically meaningful correlation between llms.txt presence and citations.
  - Removing the llms.txt feature improved the machine learning model's predictive performance, suggesting its presence was noise rather than signal [S1].
- Platform guidance:
  - Google's AI search documentation describes AI Overviews as relying on existing ranking systems and signals and does not mention llms.txt for ranking or citation [S3].
  - OpenAI's crawler documentation focuses on robots.txt rules (for example, OAI-SearchBot, GPTBot) for search and training control; llms.txt is not described as a relevance or ranking driver [S4].
  - There is anecdotal evidence of GPTBot occasionally fetching llms.txt, but no observed link to citation behavior [S1].
These points are broadly uncontested in the sources cited.
Breakdown and mechanics
What llms.txt is actually for
llms.txt is a proposed convention: a text file, usually at example.com/llms.txt, where sites can express preferences around how LLMs use their content. Conceptually, it sits next to robots.txt:
- robots.txt: machine instructions for crawling and indexing.
- llms.txt: human- and machine-readable statements about LLM training or usage rights.
Current platform statements suggest robots.txt remains the primary control for both AI search and training inclusion [S3–S4]. llms.txt is, at best, a secondary hint that some crawlers may read.
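For illustration, here is a minimal llms.txt sketch. The format is a proposed convention rather than an enforced standard, so the structure below (a markdown-style title, short summary, key links, and a plain-language usage statement) and the example.com URLs are assumptions, not a layout any platform has committed to honoring.

```
# Example Co.

> Example Co. publishes analytics documentation and a marketing blog.

Summarization and citation with attribution are welcome.
Bulk use of this content for model training requires a license:
https://example.com/ai-usage-policy

## Key pages
- [Documentation](https://example.com/docs): product reference
- [Blog](https://example.com/blog): analytics and SEO guides
```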
How LLM citations are likely chosen
A simplified pipeline for an LLM answer that cites sources looks like this (a schematic sketch follows the list):
- Query and intent: the user asks a question.
- Retrieval: the system fetches candidate documents from its index or a web search API.
- Ranking and filtering: documents are ranked by relevance, authority, freshness, and similar factors.
- Answer generation: the LLM composes text, possibly conditioned on retrieved documents.
- Citation selection: top supporting documents are surfaced as links or references.
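To make the pipeline concrete, here is a schematic sketch in Python. It is not any vendor's implementation; the Doc fields, scoring weights, and thresholds are illustrative assumptions.

```python
# Schematic citation pipeline. All fields, weights, and thresholds are
# illustrative assumptions, not a real platform's retrieval stack.
from dataclasses import dataclass


@dataclass
class Doc:
    url: str
    relevance: float   # query-document match score (assumed precomputed)
    authority: float   # domain-level quality proxy
    freshness: float   # recency score in [0, 1]


def retrieve(index: list[Doc]) -> list[Doc]:
    # In practice this step queries a search index or web search API.
    return [d for d in index if d.relevance > 0.2]


def rank(candidates: list[Doc]) -> list[Doc]:
    # Blend relevance, authority, and freshness; the weights are made up.
    def score(d: Doc) -> float:
        return 0.6 * d.relevance + 0.3 * d.authority + 0.1 * d.freshness
    return sorted(candidates, key=score, reverse=True)


def select_citations(ranked: list[Doc], k: int = 3) -> list[str]:
    # The answer is generated from the top documents, and those same
    # documents are surfaced as citations.
    return [d.url for d in ranked[:k]]
```

Notice that nothing in this flow reads llms.txt, which is consistent with the absence of a measurable citation effect.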
Where could llms.txt fit into that pipeline?
- As a filter: exclude content from retrieval or from being cited.
- As a preference signal: slightly favor content that explicitly permits training or citation.
- As metadata: used for later compliance or auditing, not at query time.
SE Ranking's findings suggest that, at present, llms.txt is either:
- Not being read by most platforms in a way that affects retrieval, ranking, or citation, or
- Being used only for negative control (for example, skipping content whose llms.txt explicitly forbids reuse), with the share of domains doing this too small to show up as a broad pattern in citations.
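The negative-control case is the easiest to imagine technically. Layered on the sketch above, and assuming a hypothetical lookup of domains whose llms.txt declares a blanket disallow, a filter step might look like this; no platform has confirmed applying llms.txt at this stage.

```python
# Hypothetical negative-control filter. The llms_txt_disallows set is an
# assumed lookup; no platform documents applying llms.txt this way.
from urllib.parse import urlparse


def filter_by_llms_txt(candidates: list[Doc], llms_txt_disallows: set[str]) -> list[Doc]:
    # Drop candidates whose host has declared a blanket disallow in llms.txt.
    return [d for d in candidates if urlparse(d.url).netloc not in llms_txt_disallows]
```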
Why low adoption suppresses signal
In rough terms, SE Ranking's dataset looks like this:
- Total domains: around 300,000.
- Domains with llms.txt: around 30,000.
- Domains without llms.txt: around 270,000.
If llms.txt had a substantial positive effect on citations, for example a 20 to 30 percent uplift, we would expect:
- Average citation rates for the 10 percent with the file to be visibly higher than for the 90 percent without it.
- A feature like has_llms_txt to carry positive feature importance in the XGBoost model.
Instead, SE Ranking reports that correlations were not significant and that excluding the feature improved model performance [S1]. The most straightforward interpretation is that any effect of llms.txt on citations is currently indistinguishable from random variation in larger factors such as domain authority, topic coverage, and brand demand.
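A back-of-the-envelope simulation shows why an uplift of that size would be hard to miss at this sample size. The baseline citation rate and the uplift below are invented numbers, and this is not SE Ranking's methodology; it only illustrates the detectability argument.

```python
# Toy detectability check: would a 25 percent citation uplift for the ~10% of
# domains with llms.txt show up in a 300k-domain sample? Rates are invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_with, n_without = 30_000, 270_000   # ~10 percent adoption in a 300k sample
base_rate = 0.05                      # assumed baseline citation rate
uplifted_rate = base_rate * 1.25      # hypothetical 25 percent uplift

cited_with = rng.binomial(1, uplifted_rate, n_with)
cited_without = rng.binomial(1, base_rate, n_without)

t_stat, p_value = stats.ttest_ind(cited_with, cited_without, equal_var=False)
print(f"citation rate with llms.txt:    {cited_with.mean():.4f}")
print(f"citation rate without llms.txt: {cited_without.mean():.4f}")
print(f"p-value: {p_value:.1e}")  # effectively zero at this sample size
```

No such separation appears in the real data, which makes "no effect" the simpler explanation than "an effect hidden by noise."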
Impact assessment
Organic search and Google AI Overviews
Direction: minimal to none for visibility; small for governance.
Google explicitly frames AI Overviews as using the same ranking systems and signals as traditional search in its AI search guidance [S3]. There is no mention of llms.txt in relation to ranking, citation, or eligibility.
Combined with SE Ranking's findings, this suggests that adding llms.txt is unlikely to change how often you appear or are cited in AI Overviews today.
For most brands, the marginal SEO return is much higher from:
- Strengthening topical coverage around target queries.
- Improving content clarity and structure so extractive summaries have cleaner material to work with.
- Maintaining solid technical health so Google crawlers can access and understand pages.
Who benefits: brands that need a documented position on AI use for legal or compliance teams. Who does not benefit yet: teams looking for immediate AI search traffic lifts.
LLM chat and AI search products
Direction: limited impact on citations; moderate value for control.
OpenAI's public crawler documentation emphasizes robots.txt and specific user agents (OAI-SearchBot, GPTBot) for both search results and training [S4]. There is no claim that llms.txt affects ranking, snippet selection, or how many times a domain is cited in ChatGPT responses or new search experiences.
For marketers, that means:
- Citation frequency is still dominated by classic relevance and authority signals, plus how well your content matches common question patterns.
- If you care about allowing or limiting training, robots.txt rules that target GPTBot and similar crawlers remain the main technical switch, not llms.txt.
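For reference, the corresponding robots.txt controls look roughly like this. The OAI-SearchBot and GPTBot user agents come from OpenAI's documentation [S4]; whether to allow or disallow each crawler is a policy choice, and the blanket rules below are just one illustrative combination.

```
# robots.txt (illustrative policy: visible in AI search, opted out of training)
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /
```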
Who benefits: legal and compliance stakeholders who want a machine readable policy separate from robots.txt. Who loses: anyone expecting extra mentions in ChatGPT or similar tools simply by deploying llms.txt.
Content, brand, and legal governance
Direction: moderate strategic value, low traffic impact.
Where llms.txt does have present-day value is messaging and governance:
- It gives you a central, inspectable declaration of your stance on LLM use, which can be referenced in contracts, PR, or platform disputes.
- For publishers, regulated industries, or organizations with strong IP control requirements, this is useful even without a proven ranking effect.
- For smaller brands with limited legal overhead and less risk exposure, the practical advantage is smaller; a clear human readable policy page and robots.txt rules may already cover most needs.
Engineering and operations
Direction: very low effort; low but nonzero opportunity cost.
Creating llms.txt is typically a sub-hour task for a developer or SEO: draft the policy, add the text file, and test that it is publicly accessible.
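If you do deploy the file, a quick reachability check is usually the only testing needed; example.com below is a placeholder for your own domain.

```python
# Minimal check that llms.txt is publicly served. Replace the placeholder
# domain with your own; this is a convenience sketch, not a required step.
import requests

resp = requests.get("https://example.com/llms.txt", timeout=10)
print(resp.status_code, resp.headers.get("Content-Type"))
assert resp.status_code == 200, "llms.txt is not publicly reachable"
```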
The hidden cost is attention:
- Documenting policy, getting approvals, and fielding internal questions.
- Maintaining consistency between llms.txt, robots.txt, the privacy policy, and terms of service.
Given the lack of measurable visibility upside, engineering time is usually better spent on tasks clearly tied to search and AI performance, such as log analysis, schema markup, and improving page speed, unless your organization has a strong reason to formalize AI usage policies.
Scenarios and probabilities
This section includes informed speculation based on current data and platform statements.
Base case: governance-only signal for the medium term (likely)
- llms.txt remains a policy and consent artifact, used by some crawlers to avoid training on restricted content.
- Search and citation systems continue to rely primarily on existing ranking signals plus robots.txt for inclusion control.
- Citation behavior across LLMs shows no clear pattern tied to llms.txt, even as adoption rises modestly.
Marketing implication: keep llms.txt if your legal or compliance team wants it; do not treat it as part of your growth planning.
Upside case: mild trust or eligibility hint (possible)
- As adoption increases, one or more platforms start using llms.txt as a filter for which domains are eligible for certain AI features or as a soft preference when multiple equally relevant sources are available.
- Any effect size stays small relative to content quality and domain strength but becomes measurable over time.
Marketing implication: if this emerges, llms.txt becomes similar to a minor structured metadata signal. Helpful to have, but not a substitute for real authority.
Downside case: fragmented standards and inconsistent enforcement (edge)
- Different LLM providers interpret llms.txt differently, creating mismatches between expectations and reality.
- Some sites rely on it to signal disallow, but platforms either fail to read it consistently or prioritize robots.txt alone.
- Legal or PR disputes raise the stakes around how much weight platforms give to the file.
Marketing implication: the main risk is legal and trust exposure rather than traffic loss. Clear communication and redundancy across terms, robots.txt, and llms.txt would matter more than any traffic effect.
Risks, unknowns, and limitations
- Sampling and methodology limits: SE Ranking's dataset covers around 300,000 domains, but the full list of LLM providers and the number of queries per domain driving citation counts are not published. Niche LLMs or vertical tools could treat llms.txt differently without affecting broad statistics [S1].
- Time lag: llms.txt is relatively new, and systems may take months or longer to integrate it in ways that affect citations. Today's no-effect reading might not hold in one to two years.
- Signal granularity: the study looked at presence versus absence of llms.txt, not the policy content. It is possible, though currently unsupported by evidence, that specific directives within the file could be interpreted in subtle ways later.
- Platform opacity: neither Google nor OpenAI discloses full details of their ranking or retrieval pipelines. If llms.txt is used as an internal guardrail or filter in limited contexts, it may not be visible in public metrics.
- Potential falsifiers:
  - Transparent platform statements that llms.txt is now used as a ranking or citation feature, with reproducible tests.
  - Independent studies showing a statistically significant uplift or suppression in citations for comparable domains related only to llms.txt status.
Overall, the evidence points one way: llms.txt is currently a governance and consent artifact, not a visibility lever. Adoption is low, platform guidance points to robots.txt and existing ranking signals, and any future effect on citations remains speculation rather than observed data.
Sources
- [S1] SE Ranking, 2025, blog analysis: "Does llms.txt Impact LLM Citations? Analysis of 300K Domains."
- [S2] Matt G. Southern, Search Engine Journal, 2025, news article: "LLMS.TXT Shows No Clear Effect On AI Citations, Based On 300K Domains."
- [S3] Google, May 2025, developer blog: "Succeeding in AI Search."
- [S4] OpenAI, 2025, documentation: "Bots and web crawler controls for OpenAI Search and GPTBot."