AI Discovery Files vs Web Standards
AI Discovery Files don't replace existing web standards — they fill specific gaps that current standards leave open when AI systems try to understand your website.
Here's exactly what each existing standard does, what it doesn't do, and which AI Discovery File bridges the gap.
The Core Principle: Complement, Don't Replace
Every existing web standard was designed for a specific purpose — and each one does that job well. But none of them were designed to answer the questions AI systems now ask:
- What is this business, exactly? — Not what a page is about, but who the entity behind the website is
- What can I say about them? — Not whether I can crawl, but whether I can cite, quote, or recommend
- What should I never claim? — Explicit boundaries that prevent hallucinated services or brand conflation
- How should I refer to them? — Correct name, capitalisation, pronunciation, and terminology
AI Discovery Files answer these questions using simple, standardised file formats that AI systems can consume directly. The Interoperability Guide defines clear precedence rules for when information overlaps.
robots.txt
What robots.txt Does
Controls which pages and directories web crawlers can access. Uses User-agent and Disallow directives to grant or deny access at the URL path level. Universally supported by search engine crawlers since 1994.
What It Doesn't Do for AI
- No built-in distinction between search crawlers and AI crawlers beyond naming each user agent individually
- Can't express "allow crawling but don't use for training"
- No granular AI-specific permissions (citation, quoting, recommendation)
- Binary allow/disallow only — no nuance
What robots-ai.txt Adds
AI-specific crawler directives with granular control over how different AI systems interact with your content. Extends the concept of robots.txt into AI-specific territory.
- Named AI crawler directives (GPTBot, ClaudeBot, etc.)
- Separate permissions for crawling vs training vs citation
- Content-type specific rules (allow blog posts, restrict client data)
- Works alongside robots.txt — robots.txt always takes precedence
robots.txt always wins. If robots.txt blocks a crawler, robots-ai.txt cannot override that restriction. See interoperability rules.
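The precedence rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: robots.txt is evaluated with the standard library's `urllib.robotparser`, while the `ai_permissions` dictionary stands in for an already-parsed robots-ai.txt. Its keys (`citation`, `training`) and the deny-by-default behaviour are assumptions made for the example, not part of any published format.

```python
from urllib.robotparser import RobotFileParser


def effective_permission(
    robots_txt: str,
    ai_permissions: dict[str, bool],
    agent: str,
    url: str,
    use: str,
) -> bool:
    """Return True only if robots.txt allows the fetch AND the AI rules allow the use."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    # robots.txt is checked first: a block here cannot be overridden.
    if not parser.can_fetch(agent, url):
        return False

    # Hypothetical AI-specific permissions; uses not listed are denied (assumption).
    return ai_permissions.get(use, False)


ROBOTS_TXT = "User-agent: GPTBot\nDisallow: /private/\n"
AI_RULES = {"citation": True, "training": False}  # illustrative keys only

blocked = effective_permission(
    ROBOTS_TXT, AI_RULES, "GPTBot", "https://example.com/private/report", "citation"
)
allowed = effective_permission(
    ROBOTS_TXT, AI_RULES, "GPTBot", "https://example.com/blog/post", "citation"
)
```

Here `blocked` is `False` even though citation is permitted, because robots.txt disallows the path, while `allowed` is `True` because both layers agree.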
Schema.org Structured Data
What Schema.org Does
Provides structured data vocabulary for describing content within HTML pages. Powers rich search results (knowledge panels, FAQs, product listings). Embedded as JSON-LD, Microdata, or RDFa within individual pages.
What It Doesn't Do for AI
- Page-scoped, not site-scoped — no single place for canonical identity
- Describes content, not business identity or AI permissions
- No mechanism for "do not claim we offer X" boundaries
- Requires parsing HTML — not a standalone file AI systems can fetch directly
What identity.json Adds
A single, authoritative, standalone JSON file at the website root declaring canonical business identity. AI systems can fetch one file and know exactly who you are.
- Site-wide canonical identity: name, description, URL, contact
- Explicit service declarations and exclusions
- Social profiles and authoritative URLs
- Standalone file — no HTML parsing required
identity.json takes precedence over Schema.org for business naming and identity when information conflicts. Schema.org remains authoritative for page-level content description. See interoperability rules.
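As a rough sketch of how a consumer might apply this precedence, the function below prefers the name from identity.json and falls back to a Schema.org `Organization` block only when identity.json is missing or has no name. The `name` field in identity.json is an assumption for illustration; the actual field names are defined by the specification, not by this example.

```python
import json
from typing import Optional


def canonical_name(
    identity_json: Optional[str],
    schema_org_jsonld: Optional[str],
) -> Optional[str]:
    """Pick the business name, preferring identity.json over Schema.org JSON-LD."""
    # Sources are listed in precedence order: identity.json first.
    for raw in (identity_json, schema_org_jsonld):
        if not raw:
            continue
        data = json.loads(raw)
        # "name" is a real Schema.org property; for identity.json it is assumed.
        name = data.get("name") if isinstance(data, dict) else None
        if name:
            return name
    return None


identity = '{"name": "Acme Widgets Ltd"}'
schema = '{"@context": "https://schema.org", "@type": "Organization", "name": "ACME"}'

preferred = canonical_name(identity, schema)
fallback = canonical_name(None, schema)
```

With both sources present, `preferred` takes the identity.json value; with identity.json absent, `fallback` uses the Schema.org name.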
security.txt
What security.txt Does
Standardised file (RFC 9116) for publishing security vulnerability disclosure policies. Tells security researchers how to report issues, who to contact, and what your disclosure policy is. Published at /.well-known/security.txt.
What It Doesn't Do for AI
- Addresses security researchers, not AI systems
- No mechanism for AI-specific usage permissions
- Can't declare what AI systems may or may not say about you
- No coverage for content licensing, citation, or attribution rules
What ai.txt Adds
The AI equivalent of security.txt. A simple text file declaring your website's policies for AI system interaction — what's permitted, what's restricted, and how attribution should work.
- Explicit AI usage permissions (training, citation, quoting)
- Content licensing declarations
- Attribution requirements for AI-generated citations
- Opt-in/opt-out signals for AI use cases
security.txt and ai.txt serve completely different audiences with no overlap. Both can and should coexist.
humans.txt
What humans.txt Does
A plain text file crediting the people behind a website — developers, designers, project managers. An informal convention (not an RFC) for human-readable acknowledgement. Published at the website root.
What It Doesn't Do for AI
- Informal, no standardised structure
- Credits individuals, doesn't define brand identity
- No naming rules, capitalisation guidelines, or terminology preferences
- Not designed for machine parsing
What brand.txt Adds
Machine-readable brand guidelines for AI systems. Defines how your brand name should be written, pronounced, and referenced — and what terms to avoid.
- Correct brand name, capitalisation, and spacing
- Pronunciation guides for voice AI systems
- Terms, abbreviations, and names to avoid
- Structured format that AI systems can parse reliably
humans.txt credits people; brand.txt defines how AI systems should refer to the brand. No conflict or overlap.
ads.txt
What ads.txt Does
Declares authorised digital advertising sellers for a domain (IAB Tech Lab standard). Prevents ad fraud by letting advertisers verify that ad inventory is sold through legitimate channels. Plain text file at the website root.
What It Doesn't Do for AI
- Specific to advertising supply chain
- No mechanism for AI interaction permissions
- Can't declare content licensing or usage restrictions
- Addresses advertising platforms, not AI systems
What ai.json Adds
The machine-parseable counterpart to ai.txt. Where ads.txt declares authorised ad sellers, ai.json declares authorised AI interaction rules in a structured JSON format with JSON Schema validation.
- Structured AI permissions and restrictions
- Programmatic access for automated tools and validators
- JSON Schema for automated validation
- Granular content-type specific rules
ai.json takes precedence over ai.txt for permissions when both exist and conflict. See interoperability rules.
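One way to picture this rule: treat both files as sets of permissions and let ai.json overwrite ai.txt wherever the two disagree. The sketch below assumes both files have already been parsed into dictionaries; the permission keys shown are hypothetical and chosen only to illustrate the merge.

```python
def merge_permissions(
    ai_txt: dict[str, bool],
    ai_json: dict[str, bool],
) -> dict[str, bool]:
    """Combine permissions from both files, with ai.json winning on conflict."""
    merged = dict(ai_txt)   # start from the ai.txt declarations
    merged.update(ai_json)  # ai.json values replace any conflicting entries
    return merged


# Illustrative, already-parsed permissions (keys are assumptions).
from_ai_txt = {"training": True, "citation": True}
from_ai_json = {"training": False, "quoting": True}

permissions = merge_permissions(from_ai_txt, from_ai_json)
```

In this example `training` ends up `False` because ai.json overrides ai.txt, while `citation` and `quoting` are carried over from whichever file declared them.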
What Only AI Discovery Files Provide
Some AI Discovery Files have no existing web standard equivalent at all. These address needs that simply didn't exist before AI systems became primary information sources.
llms.txt
AI-readable business context in Markdown format
No existing standard provides a structured, AI-optimised summary of a business. llms.txt gives AI systems a single document that explains who you are, what you do, and what context is important — written specifically for LLM consumption.
faq-ai.txt
Authoritative Q&A for AI retrieval
While Schema.org can mark up FAQ pages, faq-ai.txt is a standalone file of pre-authored answers specifically for AI citation. It ensures AI systems use your approved answers rather than generating their own from scattered page content.
developer-ai.txt
Technical context for AI systems
No standard communicates your technical platform, API availability, versioning conventions, or integration context to AI systems. developer-ai.txt gives AI the technical metadata it needs to provide accurate developer-facing responses.
llms.html
Human-readable reference version
A formatted HTML presentation of llms.txt content, giving humans a readable reference of what AI systems see. Bridges the gap between machine-readable and human-inspectable.
Quick Reference
| Existing Standard | Purpose | AI Gap | AI Discovery File |
|---|---|---|---|
| robots.txt | Crawler access control | No AI-specific granularity | robots-ai.txt |
| Schema.org | Page-level structured data | No site-wide canonical identity | identity.json |
| security.txt | Vulnerability disclosure | No AI usage permissions | ai.txt |
| humans.txt | Team credits | No brand/naming rules for AI | brand.txt |
| ads.txt | Authorised ad sellers | No AI interaction rules | ai.json |
| None | — | No AI-readable business summary | llms.txt |
| None | — | No pre-authored AI-ready FAQs | faq-ai.txt |
| None | — | No technical/developer context for AI | developer-ai.txt |
Frequently Asked Questions
Do AI Discovery Files replace robots.txt?
No. AI Discovery Files complement robots.txt; they do not replace it. robots.txt controls crawler access at the URL path level. robots-ai.txt adds AI-specific granular directives, while other AI Discovery Files address identity, permissions, and context that robots.txt was never designed to handle.
Why not just use Schema.org for AI visibility?
Schema.org provides structured data for search engine result features like rich snippets. It is embedded within HTML pages and describes page-level content. AI Discovery Files are standalone root-level files that declare site-wide identity, permissions, and context specifically for AI systems. They address questions Schema.org was not designed for: What is the canonical business name? What can AI say about us? What services should AI not claim we offer?
Can I use both AI Discovery Files and existing standards?
Yes, and you should. AI Discovery Files are designed to work alongside existing standards, not replace them. The Interoperability Guide defines clear precedence rules for when information overlaps or conflicts between AI Discovery Files and standards like robots.txt or Schema.org.
What if AI Discovery Files conflict with robots.txt?
The Interoperability Guide establishes clear precedence: robots.txt always takes precedence over robots-ai.txt for access control. If robots.txt blocks a crawler, robots-ai.txt cannot override that. AI Discovery Files can only grant additional AI-specific permissions within the boundaries set by existing standards.
Start Implementing
AI Discovery Files work alongside your existing web infrastructure. You don't need to change anything you already have — just add the files that fill the gaps.
Quick Start Guide
Begin with just llms.txt and ai.txt. Add more files as your AI visibility needs grow.
Ecosystem Overview
See the full ecosystem: specification tiers, adoption pathways, and the reference implementation.
Check Your Website
Run the 365i AI Visibility Checker to see which AI Discovery Files your website is missing.
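For a quick manual check, a short script along these lines could report which files a site does not yet serve. It is only a sketch and is not how the checker itself works: it assumes every file lives at the website root, treats any response other than HTTP 200 as missing, and uses the file names mentioned on this page.

```python
from typing import Callable
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen

# File names taken from this page; root-level placement is an assumption.
DISCOVERY_FILES = [
    "llms.txt", "ai.txt", "ai.json", "identity.json", "brand.txt",
    "faq-ai.txt", "developer-ai.txt", "robots-ai.txt", "llms.html",
]


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status for a HEAD request, or 0 if the host is unreachable."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        return err.code
    except URLError:
        return 0


def missing_files(
    base_url: str,
    status_of: Callable[[str], int] = http_status,
) -> list[str]:
    """List the discovery files that do not return HTTP 200 at the site root."""
    return [
        name
        for name in DISCOVERY_FILES
        if status_of(urljoin(base_url, "/" + name)) != 200
    ]
```

Calling `missing_files("https://example.com")` performs live requests; passing a custom `status_of` function makes it possible to test the logic without network access.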
Generate AI Discovery Files from your dashboard
Using WordPress? Install the plugin and create all 10 files in minutes — no coding, no configuration files to edit manually.
Get the Plugin
Register in the AI Visibility Directory
Once your AI Discovery Files are published, register your website in the AI Visibility Directory — the verified registry of websites implementing AI Discovery Files. Registration validates your implementation and lists your site for AI systems and industry peers to discover.
Card entry in the directory with automated file validation. Open to any site with a valid llms.txt file. No cost.
Dedicated profile page on the directory with dofollow backlinks to your website — a genuine SEO authority signal from a topically relevant, verified source. Includes an attribution badge and enhanced visibility.