AI Discovery Files vs Web Standards

AI Discovery Files don't replace existing web standards — they fill specific gaps that current standards leave open when AI systems try to understand your website.

Here's exactly what each existing standard does, what it doesn't do, and which AI Discovery File bridges the gap.

The Core Principle: Complement, Don't Replace

Every existing web standard was designed for a specific purpose — and each one does that job well. But none of them were designed to answer the questions AI systems now ask:

What is this business, exactly? — Not what a page is about, but who the entity behind the website is
What can I say about them? — Not whether I can crawl, but whether I can cite, quote, or recommend
What should I never claim? — Explicit boundaries that prevent hallucinated services or brand conflation
How should I refer to them? — Correct name, capitalisation, pronunciation, and terminology

AI Discovery Files answer these questions using simple, standardised file formats that AI systems can consume directly. The Interoperability Guide defines clear precedence rules for when information overlaps.

Existing Standard

`robots.txt`

What `robots.txt` Does

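Tells web crawlers which pages and directories they may access. Uses User-agent, Allow, and Disallow directives to grant or deny access at the URL path level. In use since 1994, honoured by the major search engine crawlers, and standardised as RFC 9309 in 2022. Compliance is voluntary: the file is a request to crawlers, not an enforcement mechanism.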

What It Doesn't Do for AI

Distinguishes crawlers only by user-agent name, with no way to state what a crawler's purpose is (search indexing vs AI use)
Can't express "allow crawling but don't use for training"
No granular AI-specific permissions (citation, quoting, recommendation)
Binary allow/disallow only — no nuance

What `robots-ai.txt` Adds

AI-specific crawler directives with granular control over how different AI systems interact with your content. Extends the concept of robots.txt into AI-specific territory.

Named AI crawler directives (GPTBot, ClaudeBot, etc.)
Separate permissions for crawling vs training vs citation
Content-type specific rules (allow blog posts, restrict client data)
Works alongside robots.txt — robots.txt always takes precedence

Precedence: robots.txt always wins. If robots.txt blocks a crawler, robots-ai.txt cannot override that restriction. See interoperability rules.
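
The precedence rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the specification's reference logic: the `ROBOTS_AI_RULES` dictionary stands in for an already-parsed robots-ai.txt, and its keys (`crawl`, `train`, `cite`) are hypothetical names chosen for the example. Only the robots.txt handling uses a real parser, `urllib.robotparser` from the standard library.

```python
from urllib.robotparser import RobotFileParser

# A small robots.txt: GPTBot is kept out of /clients/, everyone else is unrestricted.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /clients/

User-agent: *
Disallow:
"""

# Hypothetical stand-in for parsed robots-ai.txt rules. The real syntax and
# field names are defined by the specification, not by this sketch.
ROBOTS_AI_RULES = {
    "GPTBot": {"crawl": True, "train": False, "cite": True},
}


def is_permitted(agent: str, url: str, action: str) -> bool:
    """Allow an action only if robots.txt permits access AND robots-ai.txt grants it."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())

    # robots.txt is checked first and always wins.
    if not parser.can_fetch(agent, url):
        return False

    # Assumption of this sketch: anything robots-ai.txt does not grant is denied.
    return ROBOTS_AI_RULES.get(agent, {}).get(action, False)


print(is_permitted("GPTBot", "https://example.com/blog/post", "cite"))
print(is_permitted("GPTBot", "https://example.com/clients/acme", "cite"))
```

The second call is refused even though the hypothetical robots-ai.txt rules grant citation, because robots.txt has already blocked the path.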

Existing Standard

Schema.org Structured Data

What Schema.org Does

Provides structured data vocabulary for describing content within HTML pages. Powers rich search results (knowledge panels, FAQs, product listings). Embedded as JSON-LD, Microdata, or RDFa within individual pages.

What It Doesn't Do for AI

Page-scoped, not site-scoped — no single place for canonical identity
Describes content, not business identity or AI permissions
No mechanism for "do not claim we offer X" boundaries
Requires parsing HTML — not a standalone file AI systems can fetch directly

What `identity.json` Adds

A single, authoritative, standalone JSON file at the website root declaring canonical business identity. AI systems can fetch one file and know exactly who you are.

Site-wide canonical identity: name, description, URL, contact
Explicit service declarations and exclusions
Social profiles and authoritative URLs
Standalone file — no HTML parsing required

Precedence: identity.json takes precedence over Schema.org for business naming and identity when information conflicts. Schema.org remains authoritative for page-level content description. See interoperability rules.
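
A consumer might apply this precedence roughly as follows. The sketch is in Python and rests on assumptions: the identity.json field names (`name`, `url`, `services`, `excluded_services`) and the choice of which fields count as "identity" are illustrative only and are not taken from the specification. The JSON-LD sample uses ordinary Schema.org `Organization` properties.

```python
import json

# Hypothetical identity.json content; field names are illustrative only.
IDENTITY_JSON = json.loads("""
{
  "name": "Example Studio Ltd",
  "url": "https://example.com",
  "services": ["web design"],
  "excluded_services": ["legal advice"]
}
""")

# JSON-LD as it might be embedded in a single page.
PAGE_JSON_LD = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Studio",
    "url": "https://example.com",
    "description": "Portfolio page for Example Studio.",
}

# Assumption: these are the fields treated as business naming and identity.
IDENTITY_FIELDS = ("name", "url")


def resolve_identity(identity: dict, json_ld: dict) -> dict:
    """Start from page-level Schema.org data, then let identity.json win on identity fields."""
    resolved = {key: value for key, value in json_ld.items() if not key.startswith("@")}
    for field in IDENTITY_FIELDS:
        if field in identity:
            resolved[field] = identity[field]
    return resolved


print(resolve_identity(IDENTITY_JSON, PAGE_JSON_LD))
```

The page-level `description` survives untouched, which matches the rule that Schema.org remains authoritative for page content while identity.json settles the business name.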

Existing Standard

`security.txt`

What `security.txt` Does

Standardised file (RFC 9116) for publishing security vulnerability disclosure policies. Tells security researchers how to report issues, who to contact, and what your disclosure policy is. Published at /.well-known/security.txt.

What It Doesn't Do for AI

Addresses security researchers, not AI systems
No mechanism for AI-specific usage permissions
Can't declare what AI systems may or may not say about you
No coverage for content licensing, citation, or attribution rules

What `ai.txt` Adds

The AI equivalent of security.txt. A simple text file declaring your website's policies for AI system interaction — what's permitted, what's restricted, and how attribution should work.

Explicit AI usage permissions (training, citation, quoting)
Content licensing declarations
Attribution requirements for AI-generated citations
Opt-in/opt-out signals for AI use cases

Complementary: security.txt and ai.txt serve completely different audiences with no overlap. Both can and should coexist.

Existing Standard

`humans.txt`

What `humans.txt` Does

A plain text file crediting the people behind a website — developers, designers, project managers. An informal convention (not an RFC) for human-readable acknowledgement. Published at the website root.

What It Doesn't Do for AI

Informal, no standardised structure
Credits individuals, doesn't define brand identity
No naming rules, capitalisation guidelines, or terminology preferences
Not designed for machine parsing

What `brand.txt` Adds

Machine-readable brand guidelines for AI systems. Defines how your brand name should be written, pronounced, and referenced — and what terms to avoid.

Correct brand name, capitalisation, and spacing
Pronunciation guides for voice AI systems
Terms, abbreviations, and names to avoid
Structured format that AI systems can parse reliably

Complementary: humans.txt credits people; brand.txt defines how AI systems should refer to the brand. No conflict or overlap.
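
To make the idea of machine-readable naming rules concrete, here is a small Python sketch. The `Name:` and `Avoid:` keys and the comma-separated list are a hypothetical layout invented for this example; the actual brand.txt syntax is whatever the specification defines.

```python
import re

# Hypothetical brand.txt layout, assumed for illustration only.
BRAND_TXT = """\
Name: ExampleCo
Avoid: Example Co, Exampleco, EXC
"""


def parse_brand(text: str) -> dict:
    """Read simple "Key: value" lines into a dictionary with lower-cased keys."""
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip().lower()] = value.strip()
    return fields


def apply_brand(draft: str, brand: dict) -> str:
    """Replace each discouraged variant in a draft with the preferred brand name."""
    preferred = brand["name"]
    for term in (part.strip() for part in brand.get("avoid", "").split(",")):
        if term:
            # Whole-word, case-sensitive match so unrelated words are left alone.
            draft = re.sub(rf"\b{re.escape(term)}\b", preferred, draft)
    return draft


brand = parse_brand(BRAND_TXT)
print(apply_brand("Example Co builds sites. EXC is great.", brand))
```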

Existing Standard

`ads.txt`

What `ads.txt` Does

Declares authorised digital advertising sellers for a domain (IAB Tech Lab standard). Prevents ad fraud by letting advertisers verify that ad inventory is sold through legitimate channels. Plain text file at the website root.

What It Doesn't Do for AI

Specific to advertising supply chain
No mechanism for AI interaction permissions
Can't declare content licensing or usage restrictions
Addresses advertising platforms, not AI systems

What `ai.json` Adds

The machine-parseable counterpart to ai.txt. Where ads.txt declares authorised ad sellers, ai.json declares authorised AI interaction rules in a structured JSON format with JSON Schema validation.

Structured AI permissions and restrictions
Programmatic access for automated tools and validators
JSON Schema for automated validation
Granular content-type specific rules

Precedence: ai.json takes precedence over ai.txt for permissions when both exist and conflict. See interoperability rules.
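
One way a tool could resolve a conflict between the two files is shown in the Python sketch here. Both formats are assumptions: the "Permission: value" lines for ai.txt and the top-level `permissions` object for ai.json are hypothetical shapes used only to demonstrate the override order.

```python
import json

# Hypothetical ai.txt content: one "Permission: value" pair per line.
AI_TXT = """\
# Example policy
Training: disallow
Citation: allow
Quoting: allow
"""

# Hypothetical ai.json content with a "permissions" object.
AI_JSON = json.loads('{"permissions": {"training": "disallow", "quoting": "disallow"}}')


def parse_ai_txt(text: str) -> dict:
    """Parse "Key: value" lines, ignoring anything after a # comment marker."""
    permissions = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if ":" in line:
            key, value = line.split(":", 1)
            permissions[key.strip().lower()] = value.strip().lower()
    return permissions


def resolve_permissions(ai_txt: str, ai_json: dict) -> dict:
    """Merge both sources; where they disagree, the ai.json value replaces the ai.txt value."""
    merged = parse_ai_txt(ai_txt)
    merged.update(ai_json.get("permissions", {}))
    return merged


print(resolve_permissions(AI_TXT, AI_JSON))
```

Here `quoting` ends up as `disallow` because ai.json overrides ai.txt, while `citation` is kept from ai.txt since ai.json says nothing about it.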

What Only AI Discovery Files Provide

Some AI Discovery Files have no existing web standard equivalent at all. These address needs that simply didn't exist before AI systems became primary information sources.

`llms.txt`

AI-readable business context in Markdown format

No existing standard provides a structured, AI-optimised summary of a business. llms.txt gives AI systems a single document that explains who you are, what you do, and what context is important — written specifically for LLM consumption.

`faq-ai.txt`

Authoritative Q&A for AI retrieval

While Schema.org can mark up FAQ pages, faq-ai.txt is a standalone file of pre-authored answers specifically for AI citation. It ensures AI systems use your approved answers rather than generating their own from scattered page content.

`developer-ai.txt`

Technical context for AI systems

No standard communicates your technical platform, API availability, versioning conventions, or integration context to AI systems. developer-ai.txt gives AI the technical metadata it needs to provide accurate developer-facing responses.

`llms.html`

Human-readable reference version

A formatted HTML presentation of llms.txt content, giving humans a readable reference of what AI systems see. Bridges the gap between machine-readable and human-inspectable.

Quick Reference

Existing Standard	Purpose	AI Gap	AI Discovery File
`robots.txt`	Crawler access control	No AI-specific granularity	`robots-ai.txt`
Schema.org	Page-level structured data	No site-wide canonical identity	`identity.json`
`security.txt`	Vulnerability disclosure	No AI usage permissions	`ai.txt`
`humans.txt`	Team credits	No brand/naming rules for AI	`brand.txt`
`ads.txt`	Authorised ad sellers	No AI interaction rules	`ai.json`
None	—	No AI-readable business summary	`llms.txt`
None	—	No pre-authored AI-ready FAQs	`faq-ai.txt`
None	—	No technical/developer context for AI	`developer-ai.txt`

Frequently Asked Questions

Do AI Discovery Files replace robots.txt?

No. AI Discovery Files complement robots.txt; they do not replace it. robots.txt controls crawler access at the URL path level. robots-ai.txt adds AI-specific granular directives, while other AI Discovery Files address identity, permissions, and context that robots.txt was never designed to handle.

Why not just use Schema.org for AI visibility?

Schema.org provides structured data for search engine result features like rich snippets. It is embedded within HTML pages and describes page-level content. AI Discovery Files are standalone root-level files that declare site-wide identity, permissions, and context specifically for AI systems. They address questions Schema.org was not designed for: What is the canonical business name? What can AI say about us? What services should AI not claim we offer?

Can I use both AI Discovery Files and existing standards?

Yes, and you should. AI Discovery Files are designed to work alongside existing standards, not replace them. The Interoperability Guide defines clear precedence rules for when information overlaps or conflicts between AI Discovery Files and standards like robots.txt or Schema.org.

What if AI Discovery Files conflict with robots.txt?

The Interoperability Guide establishes clear precedence: robots.txt always takes precedence over robots-ai.txt for access control. If robots.txt blocks a crawler, robots-ai.txt cannot override that. AI Discovery Files can only grant additional AI-specific permissions within the boundaries set by existing standards.

Start Implementing

AI Discovery Files work alongside your existing web infrastructure. You don't need to change anything you already have — just add the files that fill the gaps.

Quick Start Guide

Begin with just llms.txt and ai.txt. Add more files as your AI visibility needs grow.
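
Before adding files, it can help to see which ones a site already serves. The Python sketch here is a rough illustration and is not how the 365i AI Visibility Checker works. It assumes every file lives at the website root, as this page describes for identity.json, and the `audit` and `http_exists` helpers are hypothetical names introduced for the example.

```python
import urllib.request
from typing import Callable, Dict
from urllib.parse import urljoin

# File names taken from this page; root-level placement is an assumption.
DISCOVERY_FILES = (
    "llms.txt", "ai.txt", "ai.json", "identity.json", "brand.txt",
    "robots-ai.txt", "faq-ai.txt", "developer-ai.txt", "llms.html",
)


def http_exists(url: str, timeout: float = 5.0) -> bool:
    """Return True if a HEAD request for the URL answers with HTTP 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False


def audit(base_url: str, exists: Callable[[str], bool] = http_exists) -> Dict[str, bool]:
    """Map each discovery file name to whether it appears to be published."""
    return {name: exists(urljoin(base_url, "/" + name)) for name in DISCOVERY_FILES}


# Offline demonstration with a stand-in checker that "finds" only llms.txt.
result = audit("https://example.com/blog/", exists=lambda url: url.endswith("/llms.txt"))
missing = [name for name, present in result.items() if not present]
print(missing)
```

Passing the checker in as a parameter keeps the audit logic testable without network access; calling `audit("https://example.com")` with the default would issue real HEAD requests.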

Ecosystem Overview

See the full ecosystem: specification tiers, adoption pathways, and the reference implementation.

Check Your Website

Run the 365i AI Visibility Checker to see which AI Discovery Files your website is missing.

Free WordPress Plugin

Generate AI Discovery Files from your dashboard

Using WordPress? Install the plugin and create all 10 files in minutes — no coding, no configuration files to edit manually.


Register in the AI Visibility Directory

Once your AI Discovery Files are published, register your website in the AI Visibility Directory, the verified registry of websites implementing AI Discovery Files. Registration validates your implementation and lists your site for AI systems and industry peers to discover.

Basic Listing

Card entry in the directory with automated file validation. Open to any site with a valid llms.txt file. No cost.

Full Listing (Recommended)

Dedicated profile page on the directory with dofollow backlinks to your website — a genuine SEO authority signal from a topically relevant, verified source. Includes an attribution badge and enhanced visibility.

