Q3 2026 Report

AI Discovery File Adoption Research

Measuring how the world's top websites prepare for AI systems. Based on a crawl of 1,995 domains.

Domains Crawled: 1,744 (of 1,995 total)
ADF Adoption: 9.4% (164 domains)
Avg Readiness: 2.2 (out of 5.0)
No AI Policy in robots.txt: 84.2%
You are viewing the archived Q3 2026 report.

Summary

This Q3 2026 report analyses AI Discovery File adoption across 1,744 of the web's most prominent domains. 9.4% of domains have at least one AI Discovery File, 84.2% have no AI-specific crawler policy in their robots.txt, and the average AI readiness tier is 2.2 out of 5.0. Data is collected quarterly, following our published methodology.

AI Discovery Files are a set of 10 standardised root-level files — including llms.txt, ai.txt, ai.json, identity.json, and brand.txt — that help AI systems such as ChatGPT, Claude, and Gemini discover, interpret, and correctly represent a website. This research tracks their real-world adoption among the domains most likely to be referenced by AI systems when answering user questions.

Key Findings

The Q3 2026 crawl continues the upward trend in AI Discovery File adoption, which reached 9.4% of successfully crawled domains, up from 7.2% in Q2. The gain is driven almost entirely by llms.txt, now on 122 domains with 82 valid, complete implementations. The number of AI-Ready sites (Tier 4) grew from 33 to 44, and more organisations are explicitly permitting AI crawlers rather than staying silent. One caveat matters for reading these figures: this quarter 251 domains errored versus 90 in Q2, so the successfully crawled base fell from 1,905 to 1,744. Percentages are computed on that smaller sample, so quarter-over-quarter comparisons of percentage figures are not strictly like-for-like.

  1. llms.txt adoption keeps climbing and now carries the whole category
     llms.txt grew from 93 to 122 domains (+31%), reaching 7.0% of the crawled sample. Valid, complete files rose from 61 to 82, and the wider quality overview shows complete files of all types up from 64 to 85. This one file now accounts for the overwhelming majority of all AI Discovery File adoption. No other file type grew meaningfully, so the growth story and the concentration risk are the same story.
  2. AI-Ready sites grow by a third as Cloudflare and Adobe join the list
     Tier 4 (AI-Ready) domains rose from 33 to 44 (+33%). The top adopters now include Cloudflare (global rank 7) and Adobe (rank 68), alongside Bluehost, Dyson, Fox News, Klaviyo, and Groupon. These are high-traffic, infrastructure-scale properties, which matters because their choices tend to set defaults that smaller sites copy. Tier 5 (AI-Optimised) still stands at zero: no domain yet pairs three or more valid files with explicit crawler permission and Schema.org markup.
  3. More sites explicitly allow AI crawlers instead of staying silent
     Domains that explicitly permit AI crawlers rose from 23 to 31, lifting the explicit-allow share from 1.2% to 1.8%. Sites with no AI policy at all eased from 85.0% to 84.2%. Outright blocking barely moved (Blocks All AI went from 22 to 23 domains). The shift is small in absolute terms but it points one way: where organisations are making a deliberate choice, more of them are choosing to opt in to AI access rather than block it.
  4. Crawl coverage dipped this quarter, which may overstate some declines
     Only 1,744 of 1,995 domains were successfully crawled this quarter, down from 1,905 in Q2, because 251 domains errored versus 90. That matters when reading softer numbers: Schema.org presence fell from 29.7% to 26.3% and Partially Ready (Tier 3) dropped from 21.7% to 18.5%, but both are partly a function of the smaller, slightly different sample rather than a genuine retreat. The average readiness score held flat at 2.2 out of 5.0. We flag this openly rather than presenting the declines as behavioural change we cannot prove.
  5. Adoption is still one file deep
     Of the ten AI Discovery File types we track, only llms.txt shows meaningful adoption. ai.json, identity.json, brand.txt, faq-ai.txt, and developer-ai.txt sit at zero to two domains each, and llms.html slipped from 47 to 46 domains found. Publishers are adopting the single best-known file and stopping there, which leaves identity, brand, and FAQ signals largely unaddressed across the web.
  6. Zero sites reference a formal specification
     None of the 164 domains with an AI Discovery File references a formal specification in its files, as was the case in Q2. Adoption continues to run ahead of standardisation: publishers are writing these files by hand or from ad hoc templates, with no shared schema or version to validate against. That gap is the clearest opening for a canonical, machine-checkable standard to become the reference point.

— AI Visibility Research, July 2026

Changes from Q2 2026

Quarter-over-quarter changes in key metrics between Q2 2026 and Q3 2026.

Metric Q2 2026 Q3 2026 Change
ADF Adoption 7.2% 9.4% +2.2pp
Avg Readiness Score 2.2 2.2 0.0
Domains Crawled 1,905 1,744 -161
No AI Policy 85.0% 84.2% -0.8pp
Per-file adoption change: Q2 2026 to Q3 2026
File Q2 2026 Q3 2026 Change (pp)
llms.txt 4.9% 7.0% +2.1
llms.html 2.5% 2.6% +0.1
ai.txt 0.1% 0.1% 0.0
ai.json 0.1% 0.1% 0.0
identity.json 0.0% 0.0% 0.0
brand.txt 0.1% 0.0% -0.1
faq-ai.txt 0.1% 0.0% -0.1
developer-ai.txt 0.0% 0.0% 0.0
robots-ai.txt 0.1% 0.1% 0.0
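
Because the crawled base shrank between quarters, a file's share can rise even when its raw count falls. The sketch below recomputes two rows of the table from the counts quoted in the Key Findings. It assumes each share is the found count divided by the successfully crawled domains, rounded to one decimal place, and that the change column is the difference of the rounded shares; the function name `pct` is ours, not part of the published methodology.

```python
def pct(count: int, crawled: int) -> float:
    """Share of crawled domains as a percentage, rounded to one decimal place."""
    return round(100 * count / crawled, 1)


Q2_CRAWLED, Q3_CRAWLED = 1905, 1744

# (Q2 found, Q3 found) counts quoted in the Key Findings
found = {"llms.txt": (93, 122), "llms.html": (47, 46)}

for name, (q2, q3) in found.items():
    q2_pct, q3_pct = pct(q2, Q2_CRAWLED), pct(q3, Q3_CRAWLED)
    # Assumption: the change is taken on the rounded shares, in percentage points
    change_pp = round(q3_pct - q2_pct, 1)
    print(f"{name}: {q2_pct}% -> {q3_pct}% ({change_pp:+.1f}pp), count {q3 - q2:+d}")
```

For llms.html the count drops by one domain while the share rises, which is the denominator effect described in the caveat on crawl coverage.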

New Top Adopters

  • adobe.com
  • bluehost.com
  • bunkbedsstore.uk
  • cloudflare.com
  • dyson.co.uk
  • foxnews.com
  • frontiersin.org
  • groupon.co.uk
  • gumgum.com
  • klaviyo.com
  • life360.com

No Longer in Top 20

  • dell.com
  • english-heritage.org.uk
  • greenpeace.org.uk
  • hobbycraft.co.uk
  • mailchimp.com
  • netgear.com
  • nvidia.com
  • onetrust.com
  • opera.com
  • optimizely.com
  • plesk.com

ADF Adoption by File Type

How many of the top websites have each AI Discovery File — and how many of those files pass structural validation. Files are checked at their canonical root-level URL (e.g., example.com/llms.txt) and validated against the ADF specification.

AI Discovery File adoption across 1,744 domains
File Found Valid Complete
llms.txt 122 82 82
llms.html 46 9 3
ai.txt 1 0 0
ai.json 2 0 0
identity.json 0 0 0
brand.txt 0 0 0
faq-ai.txt 0 0 0
developer-ai.txt 0 0 0
robots-ai.txt 1 1 0
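
As a rough illustration of the root-level check, the sketch below builds the candidate URLs for one domain and applies a simple soft 404 test. It is not the crawler's actual logic: the file list covers only the nine names that appear in this report's tables, and the media-type heuristic in `is_probable_soft_404` is an assumption of ours (the real detection rules live in the methodology documentation).

```python
# File names as they appear in the tables of this report
ADF_FILES = [
    "llms.txt", "llms.html", "ai.txt", "ai.json", "identity.json",
    "brand.txt", "faq-ai.txt", "developer-ai.txt", "robots-ai.txt",
]


def candidate_urls(domain: str) -> list[str]:
    """Return the canonical root-level URL for each tracked file."""
    host = domain.strip().lower().rstrip("/")
    return [f"https://{host}/{name}" for name in ADF_FILES]


def is_probable_soft_404(url: str, status: int, content_type: str) -> bool:
    """Assumed heuristic: a 200 response whose media type does not fit the file extension."""
    if status != 200:
        return False  # a real error status is not a *soft* 404
    media_type = content_type.split(";", 1)[0].strip().lower()
    if url.endswith(".html"):
        return media_type != "text/html"
    # A .txt or .json path answered with HTML is usually the site's error page
    return media_type == "text/html"
```

A caller would fetch each URL with its HTTP client of choice and pass the status code and Content-Type header to `is_probable_soft_404` before counting the file as found.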

AI Crawler Access Policies

How websites use robots.txt to manage access for 15 known AI user agents — from OpenAI's GPTBot to Anthropic's ClaudeBot. Each domain is classified into one of five access policies based on its aggregate behaviour across all agents. The per-agent table below shows which AI crawlers are most frequently blocked.

AI Crawler Company Purpose Blocked Blocked % Allowed Allowed %
CCBot Common Crawl Training 186 10.7% 14 0.8%
ClaudeBot Anthropic Training 174 10.0% 20 1.1%
GPTBot OpenAI Training 173 9.9% 29 1.7%
Bytespider ByteDance Training 167 9.6% 6 0.3%
meta-externalagent Meta Training 151 8.7% 4 0.2%
Applebot-Extended Apple Training 146 8.4% 10 0.6%
PerplexityBot Perplexity Search 134 7.7% 30 1.7%
Google-Extended Google Training 125 7.2% 22 1.3%
Diffbot Diffbot Extraction 114 6.5% 2 0.1%
cohere-ai Cohere Training 108 6.2% 3 0.2%
Amazonbot Amazon Training 105 6.0% 10 0.6%
OAI-SearchBot OpenAI Search 100 5.7% 29 1.7%
ChatGPT-User OpenAI Retrieval 83 4.8% 31 1.8%
Claude-User Anthropic Retrieval 75 4.3% 12 0.7%
FacebookBot Meta Preview 75 4.3% 3 0.2%
AI crawler access policy distribution in robots.txt
Policy Domains Percentage
Blocks All AI 23 1.3%
Blocks Selectively 217 12.4%
Rate-Limits AI 4 0.2%
Explicitly Allows 31 1.8%
No AI Policy 1,469 84.2%
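
The exact classification rules are defined in the methodology, not on this page, so the following is only a sketch of how a per-agent reading of robots.txt could roll up into the five policies above. The parsing is deliberately simplified, the agent list is a three-name subset, and every rule in `agent_status` and `classify` is an assumption (for example, a wildcard `User-agent: *` group is not treated as an AI policy here).

```python
AI_AGENTS = ["GPTBot", "ClaudeBot", "CCBot"]  # subset of the 15 agents in the table


def parse_groups(robots_txt: str) -> dict[str, list[tuple[str, str]]]:
    """Map each lower-cased user-agent token to the (field, value) rules in its group."""
    groups: dict[str, list[tuple[str, str]]] = {}
    current: list[str] = []
    in_rules = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                current, in_rules = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow", "crawl-delay"):
            in_rules = True
            for agent in current:
                groups[agent].append((field, value))
    return groups


def agent_status(groups: dict[str, list[tuple[str, str]]], agent: str) -> str:
    """Assumed per-agent rule: only a group naming the agent counts as a policy."""
    rules = groups.get(agent.lower())
    if rules is None:
        return "unmentioned"
    if ("disallow", "/") in rules:
        return "blocked"
    if any(field == "crawl-delay" for field, _ in rules):
        return "rate-limited"
    return "allowed"


def classify(robots_txt: str, agents: list[str] = AI_AGENTS) -> str:
    """Roll per-agent statuses up into one of the five policy labels (assumed precedence)."""
    groups = parse_groups(robots_txt)
    statuses = [agent_status(groups, agent) for agent in agents]
    if all(status == "unmentioned" for status in statuses):
        return "No AI Policy"
    if all(status == "blocked" for status in statuses):
        return "Blocks All AI"
    if "blocked" in statuses:
        return "Blocks Selectively"
    if "rate-limited" in statuses:
        return "Rate-Limits AI"
    return "Explicitly Allows"
```

Under these assumptions, a robots.txt that disallows GPTBot and CCBot but never names ClaudeBot is classed as Blocks Selectively, and one that only contains a `User-agent: *` group is classed as No AI Policy.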

File Quality Distribution

Among the ADF files that were found, how many meet the full specification versus providing only minimal content or containing errors. Quality is assessed using per-file structural checks — required fields must pass for a file to be considered valid; recommended fields distinguish "complete" from "minimal" implementations.

Complete 49.1%, Minimal 4.6%, Invalid 46.2%
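
The three quality labels follow from two sets of checks. A minimal sketch, assuming a file is "complete" only when every recommended check passes (the page does not state the threshold) and using hypothetical check names:

```python
def classify_quality(required: dict[str, bool], recommended: dict[str, bool]) -> str:
    """Label a found file from its per-field check results."""
    if not all(required.values()):
        return "invalid"  # any failed required field invalidates the file
    if all(recommended.values()):
        return "complete"  # assumption: all recommended fields must pass
    return "minimal"


# Hypothetical check names, for illustration only
print(classify_quality({"has_title": True}, {"has_summary": True, "has_links": False}))
```

The example call passes its required check but misses one recommended check, so it is labelled minimal.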

AI Readiness Tiers

Each domain receives a readiness tier from 0 (Unaware) to 5 (AI-Optimised) based on three inputs: valid ADF file count, AI crawler policy in robots.txt, and Schema.org presence on the homepage. The tier model is deterministic with no opaque weights — the full calculation logic is published.

AI readiness tier distribution (average score: 2.2 / 5.0)
Tier Domains Percentage
Tier 5: AI-Optimised 0 0.0%
Tier 4: AI-Ready 44 2.5%
Tier 3: Partially Ready 322 18.5%
Tier 2: Passive 1,316 75.5%
Tier 1: Actively Blocking 23 1.3%
Tier 0: Unaware 39 2.2%
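
The headline average can be reproduced from the distribution above. The sketch assumes the average readiness score is the domain-weighted mean of the tier numbers, which matches the published 2.2; the variable names are ours.

```python
# Domains per tier, copied from the table above
TIER_COUNTS = {5: 0, 4: 44, 3: 322, 2: 1316, 1: 23, 0: 39}

total = sum(TIER_COUNTS.values())
# Assumption: the average score is the mean tier number, weighted by domain count
average = sum(tier * count for tier, count in TIER_COUNTS.items()) / total
shares = {tier: round(100 * count / total, 1) for tier, count in TIER_COUNTS.items()}

print(f"{total} domains, average tier {average:.1f}")
for tier, share in shares.items():
    print(f"Tier {tier}: {share}%")
```

Three quarters of domains sit in Tier 2, which is why the average stays close to 2 even as Tier 4 grows.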

ADF vs Other Web Standards

Comparing AI Discovery File adoption against established web standards. This contextualises where ADF adoption sits relative to conventions like robots.txt (RFC 9309), ads.txt (IAB Tech Lab), security.txt (RFC 9116), and humans.txt, which are likewise published at fixed locations on the domain (the domain root, or /.well-known/ in the case of security.txt).

AI Discovery File adoption compared with established web standards
Standard Adoption
robots.txt 50.2%
ads.txt 16.1%
Schema.org 26.3%
security.txt 14.6%
humans.txt 2.3%
Any ADF file 9.4%

Notable Adopters

The top 20 domains by AI readiness tier, showing which high-profile websites are leading ADF adoption. Readiness tiers are calculated using the combinatorial scoring model.

Domain Rank Category Files Found Files Valid Readiness
adobe.com 68 Global Top 1,000 1 1 AI-Ready
asus.com 710 Global Top 1,000 1 1 AI-Ready
bluehost.com 666 Global Top 1,000 2 1 AI-Ready
bmmagazine.co.uk 739 UK Top 1,000 1 1 AI-Ready
bunkbedsstore.uk 989 UK Top 1,000 1 1 AI-Ready
classlink.com 854 Global Top 1,000 1 1 AI-Ready
cloudflare.com 7 Global Top 1,000 1 1 AI-Ready
cloudinary.com 725 Global Top 1,000 1 1 AI-Ready
datadoghq.com 788 Global Top 1,000 1 1 AI-Ready
dynatrace.com 546 Global Top 1,000 1 1 AI-Ready
dyson.co.uk 462 UK Top 1,000 1 1 AI-Ready
energysavingtrust.org.uk 489 UK Top 1,000 1 1 AI-Ready
foxnews.com 450 Global Top 1,000 1 1 AI-Ready
frontiersin.org 917 Global Top 1,000 1 1 AI-Ready
groupon.co.uk 553 UK Top 1,000 1 1 AI-Ready
gumgum.com 722 Global Top 1,000 1 1 AI-Ready
hostgator.com.br 842 Global Top 1,000 1 1 AI-Ready
kingsfund.org.uk 905 UK Top 1,000 1 1 AI-Ready
klaviyo.com 675 Global Top 1,000 1 1 AI-Ready
life360.com 753 Global Top 1,000 1 1 AI-Ready

Download the Data

Raw datasets from this quarter's crawl, licensed under CC BY 4.0. Use them for your own research, analysis, or reporting. When citing, please reference the quarter (e.g., "Q3 2026") and link to the methodology.

Methodology

How We Collect This Data

Our crawler checks the top 1,000 global and top 1,000 UK domains (deduplicated to ~1,995) for all 10 AI Discovery Files, validates each against the specification, analyses robots.txt AI crawler policies across 15 known agents, and scores each domain's overall AI readiness using a deterministic tier model. The full methodology — including validation rules, soft 404 detection, redirect classification, and scoring logic — is published for transparency.
