This specification is published and recommended for implementation. Backwards-compatible additions may occur in MINOR versions; breaking changes only in MAJOR versions, with deprecation notice. See specification conventions for status definitions.
llms.html Specification
Human-Readable Presentation of AI Business Identity Information
This specification defines the requirements for llms.html files — HTML documents that present llms.txt content in a human-readable format. The file includes Schema.org structured data for enhanced machine parsing while providing a styled, accessible presentation for human visitors who discover the AI information files.
§1 Overview
What This File Does
The llms.html file provides a human-readable presentation of the information contained in llms.txt. While llms.txt is optimised for AI parsing, llms.html serves human visitors who may discover the AI information files directly.
Why It Matters for AI Visibility
Although primarily designed for human consumption, llms.html serves several AI visibility purposes:
- Provides Schema.org structured data that search engines and AI systems can extract
- Creates a discoverable, indexable page about the business
- Links to the canonical llms.txt for AI systems that prefer plain text
- Demonstrates transparency about AI-related information the business publishes
Content Authority
The llms.html file is not a separate data source. It is a presentation layer for llms.txt content. The authoritative source for AI systems remains llms.txt.
Content in llms.html MUST exactly match the information in llms.txt. Any discrepancy SHOULD be corrected in both files simultaneously, with llms.txt as the canonical source.
§2 File Location
Primary Location
The llms.html file SHOULD be placed in the website's root directory:
https://example.com/llms.html
Alternative Location
If the root directory is not suitable, the file MAY be placed at:
https://example.com/ai/llms.html
https://example.com/.well-known/llms.html
URL Requirements
- The file MUST be served with content type text/html; charset=utf-8
- The URL MUST be accessible without authentication
- HTTPS is strongly recommended
- The file SHOULD NOT be excluded in robots.txt (it SHOULD be indexable)
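The location and content-type rules above can be checked mechanically. The following is a minimal sketch, not part of the specification: the function names `candidate_urls` and `content_type_ok` are hypothetical, and the probing order (root first, then the alternative locations) is an assumption of this example.

```python
from urllib.parse import urljoin

# Locations named in this section, in the order this sketch chooses to try them.
CANDIDATE_PATHS = ("/llms.html", "/ai/llms.html", "/.well-known/llms.html")


def candidate_urls(origin: str) -> list[str]:
    """Return the llms.html URLs to probe for a given site origin."""
    return [urljoin(origin, path) for path in CANDIDATE_PATHS]


def content_type_ok(header_value: str) -> bool:
    """Check a Content-Type header value against text/html; charset=utf-8.

    The comparison ignores case, spacing, and quoting of the charset value.
    """
    media_type, _, params = header_value.partition(";")
    if media_type.strip().lower() != "text/html":
        return False
    for param in params.split(";"):
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset":
            return value.strip().strip('"').lower() == "utf-8"
    # No charset parameter was declared at all.
    return False
```

A validator would fetch each candidate URL without credentials and pass the response's Content-Type header to `content_type_ok`; the fetching itself is left out so the sketch stays free of network access.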
§3 Format Specification
Document Format
| Property | Requirement |
|---|---|
| Encoding | UTF-8 (required) |
| Document type | HTML5 (required) |
| Language attribute | Must specify language (e.g., lang="en-GB") |
| Viewport meta | Required for mobile compatibility |
HTML Structure
The document MUST follow semantic HTML5 structure:
<!DOCTYPE html>
<html lang="en-GB">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="robots" content="noindex">
<title>AI Information - [Business Name]</title>
<link rel="canonical" href="https://example.com/llms.txt">
<script type="application/ld+json">...</script>
</head>
<body>
...
</body>
</html>
SEO Considerations
The file SHOULD include <meta name="robots" content="noindex"> to prevent duplicate content issues, as the main website pages SHOULD be the primary indexed content. The canonical URL SHOULD point to llms.txt.
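As an illustration of how the head elements in the skeleton above might be verified, here is a small sketch using Python's standard `html.parser` module. The class name `HeadAudit`, the function `audit_head`, and the wording of the problem messages are assumptions of this example. The robots directive is deliberately not checked, because it is a SHOULD rather than a MUST.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collect the document-level facts named in the skeleton."""

    def __init__(self) -> None:
        super().__init__()
        self.lang = None
        self.charset = None
        self.has_viewport = False
        self.canonical = None
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "html":
            self.lang = a.get("lang")
        elif tag == "meta":
            if "charset" in a:
                self.charset = a["charset"]
            elif a.get("name") == "viewport":
                self.has_viewport = True
        elif tag == "link":
            if "canonical" in (a.get("rel") or "").lower().split():
                self.canonical = a.get("href")
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def audit_head(html: str) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    audit = HeadAudit()
    audit.feed(html)
    audit.close()
    problems = []
    if not audit.lang:
        problems.append("missing lang attribute on <html>")
    if (audit.charset or "").lower() != "utf-8":
        problems.append("missing or non-UTF-8 <meta charset>")
    if not audit.has_viewport:
        problems.append("missing viewport meta")
    if not audit.title.strip():
        problems.append("missing <title>")
    if not (audit.canonical or "").endswith("/llms.txt"):
        problems.append("canonical link does not point to llms.txt")
    return problems
```

Calling `audit_head` on the contents of a fetched llms.html returns the unmet requirements, which makes it easy to wire into a publishing check.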
§4 Required Elements
Head Section
| Element | Description | Status |
|---|---|---|
| <meta charset> | UTF-8 character encoding declaration | Required |
| <meta viewport> | Responsive viewport configuration | Required |
| <title> | Page title including business name | Required |
| <link rel="canonical"> | Points to llms.txt as canonical version | Required |
| JSON-LD structured data | Schema.org Organization markup | Recommended |
Body Content
| Element | Description | Status |
|---|---|---|
| Notice/disclaimer | Statement that this is a human-readable version of llms.txt | Required |
| Business name (H1) | Primary heading with official business name | Required |
| Description | Business summary matching llms.txt blockquote | Required |
| Services section | List of services with links | Recommended |
| Contact information | Contact details matching llms.txt | Required |
| Link to llms.txt | Visible link to the machine-readable version | Required |
Optional Content
| Element | Description | Status |
|---|---|---|
| Team/Leadership section | Key people with titles (matching llms.txt if present) | Optional |
| Locations section | Office locations or service areas | Optional |
| Certifications section | Industry certifications or accreditations | Optional |
| Additional Schema.org types | LocalBusiness, ProfessionalService, or industry-specific types | Optional |
| Last updated date | Visible indication of when information was last updated | Optional |
| Custom sections | Additional sections matching optional sections in llms.txt | Optional |
Content Not Permitted
The following content types MUST NOT be included in llms.html files:
- Tracking scripts: Do not include analytics, advertising, or tracking JavaScript beyond essential functionality.
- Hidden content: All content SHOULD be visible; do not use CSS to hide text for SEO purposes.
- Duplicate Schema.org types: Use one primary Organization type; avoid conflicting @type declarations.
- Content not in llms.txt: The HTML file MUST reflect llms.txt content; do not add information not in the canonical source.
- Interactive elements: Forms, login prompts, or dynamic content that could break or confuse AI systems.
- Advertising: The file SHOULD be purely informational; no promotional banners or ads.
- External resources that may break: Avoid depending on third-party resources that could become unavailable.
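Two of the prohibitions above (scripts beyond JSON-LD, and interactive elements) lend themselves to a simple automated scan. The sketch below is one possible approach, not a normative check: the names `ContentLint` and `lint_content` and the exact set of tags treated as interactive are assumptions. Because the specification allows scripts needed for essential functionality, a flagged script is a prompt for human review rather than proof of a violation.

```python
from html.parser import HTMLParser

# Tags this sketch treats as interactive; the set is an assumption.
INTERACTIVE_TAGS = {"form", "input", "select", "textarea", "button"}


class ContentLint(HTMLParser):
    """Flag elements that the 'Content Not Permitted' list warns about."""

    def __init__(self) -> None:
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "script" and a.get("type") != "application/ld+json":
            self.findings.append("script other than JSON-LD")
        elif tag in INTERACTIVE_TAGS:
            self.findings.append(f"interactive element <{tag}>")


def lint_content(html: str) -> list[str]:
    """Return findings in document order; an empty list means nothing was flagged."""
    lint = ContentLint()
    lint.feed(html)
    lint.close()
    return lint.findings
```

Hidden text, advertising, and fragile third-party resources are harder to detect reliably from markup alone and are not covered here.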
§5 Schema.org Structured Data
Required Markup
The file SHOULD include JSON-LD structured data with Schema.org Organization vocabulary:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Organization",
"name": "Horizon Strategic Consulting",
"alternateName": "Horizon Consulting",
"url": "https://www.horizonconsulting.example",
"description": "UK-headquartered management consultancy...",
"foundingDate": "2012-03-15",
"address": {
"@type": "PostalAddress",
"streetAddress": "45 Deansgate",
"addressLocality": "Manchester",
"postalCode": "M3 2BA",
"addressCountry": "GB"
},
"contactPoint": {
"@type": "ContactPoint",
"email": "hello@horizonconsulting.example",
"telephone": "+44 161 555 0123",
"contactType": "General Enquiries"
},
"areaServed": ["GB", "IE", "NL", "BE"]
}
</script>
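To confirm that a published page carries parseable markup like the example above, a consumer or validator could extract the JSON-LD blocks and inspect them. This is a minimal sketch using only the Python standard library; `JsonLdExtractor`, `extract_json_ld`, and `organization_blocks` are hypothetical names. Invalid JSON raises `json.JSONDecodeError`, which this sketch lets propagate to the caller.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect and parse every <script type="application/ld+json"> block."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks = []
        self._buffer = None  # holds text chunks while inside a JSON-LD script

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._buffer = None


def extract_json_ld(html: str) -> list:
    """Return the parsed JSON-LD blocks found in the page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


def organization_blocks(html: str) -> list[dict]:
    """Return only the blocks whose @type is Organization."""
    return [
        block
        for block in extract_json_ld(html)
        if isinstance(block, dict) and block.get("@type") == "Organization"
    ]
```

A page that follows the guidance on avoiding duplicate types would yield exactly one entry from `organization_blocks`.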
Alignment with identity.json
The Schema.org structured data in llms.html SHOULD align with the data in identity.json. If both files exist, they SHOULD contain consistent information:
- Business name MUST match exactly
- URL MUST be identical
- Contact information MUST be consistent
- Geographic scope SHOULD align
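The alignment rules above can be sketched as a field-by-field comparison. This example is illustrative only: the identity.json keys it reads ("name", "url", "email") are hypothetical placeholders, since the actual field names are defined by the identity.json specification, and comparing email addresses without regard to case is an assumption about what "consistent" means. Geographic scope is omitted because it is a SHOULD.

```python
def alignment_issues(schema_org: dict, identity: dict) -> list[str]:
    """Compare JSON-LD taken from llms.html with the contents of identity.json.

    The identity.json keys used here are hypothetical; substitute the field
    names from the identity.json specification.
    """
    issues = []
    if schema_org.get("name") != identity.get("name"):
        issues.append("name: MUST match exactly")
    if schema_org.get("url") != identity.get("url"):
        issues.append("url: MUST be identical")

    # contactPoint is an object in the example markup; skip other shapes.
    contact = schema_org.get("contactPoint")
    schema_email = contact.get("email") if isinstance(contact, dict) else None
    identity_email = identity.get("email")
    if schema_email and identity_email:
        if schema_email.lower() != identity_email.lower():
            issues.append("contact email: MUST be consistent")
    return issues
```

An empty result means the checked fields agree; it does not establish that the two files are consistent in every respect.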
§6 Relationship to Other Files
File Hierarchy
| File | Role | Relationship |
|---|---|---|
| llms.txt | Canonical source | llms.html presents this content |
| llms.html | Presentation layer | Human-readable; links to llms.txt |
| identity.json | Structured identity | Schema.org data SHOULD align |
Linking Requirements
The llms.html file MUST include:
- A <link rel="canonical"> pointing to llms.txt
- A visible link to llms.txt in the page content or footer
- Links to other AI discovery files if they exist (ai.txt, identity.json, etc.)
For complete conflict resolution rules, see the Interoperability Guide.
§7 Canonical Example
A complete canonical example for Horizon Strategic Consulting is available.
Key Features Demonstrated
- Proper HTML5 document structure
- Canonical link to llms.txt
- Schema.org Organization structured data
- Notice about human-readable version
- All sections from llms.txt represented
- Links to other AI discovery files
- Footer with link to machine-readable version
§8 Implementation Notes
Best Practices
- Keep styling minimal and professional
- Ensure the page is accessible (proper heading hierarchy, alt text, keyboard navigation)
- Use semantic HTML elements (<article>, <section>, <nav>)
- Include a clear notice that this is an AI information page
- Test structured data with Google's Rich Results Test
Accessibility Requirements
- Proper heading hierarchy (single H1, logical H2/H3 structure)
- Sufficient colour contrast (WCAG 2.1 AA minimum)
- Keyboard-navigable interface
- Descriptive link text
Synchronisation
When updating business information:
- Update llms.txt first (canonical source)
- Update llms.html to match
- Verify Schema.org data is consistent
- Update last modified date if displayed
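After working through the steps above, a quick drift check can confirm that the two files still agree on the basics. The sketch below assumes that llms.txt carries the business name as a leading `# ` heading and the summary as `>` blockquote lines, which is consistent with the H1 and blockquote mentioned in §4 but is not spelled out in this document. The names `parse_llms_txt`, `TextCollector`, and `sync_issues` are hypothetical.

```python
from html.parser import HTMLParser


def parse_llms_txt(text: str) -> tuple:
    """Return (name, summary) from the first '# ' heading and '>' lines."""
    name, summary = None, []
    for line in text.splitlines():
        if name is None and line.startswith("# "):
            name = line[2:].strip()
        elif line.startswith(">"):
            summary.append(line[1:].strip())
    return name, " ".join(" ".join(summary).split())


class TextCollector(HTMLParser):
    """Collect the H1 text and all text outside script and style elements."""

    def __init__(self) -> None:
        super().__init__()
        self.h1 = ""
        self.text = []
        self._in_h1 = False
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._skip:
            return
        if self._in_h1:
            self.h1 += data
        self.text.append(data)


def sync_issues(llms_txt: str, llms_html: str) -> list[str]:
    """Report where llms.html has drifted from llms.txt."""
    name, summary = parse_llms_txt(llms_txt)
    page = TextCollector()
    page.feed(llms_html)
    page.close()
    visible = " ".join("".join(page.text).split())  # normalise whitespace
    issues = []
    if name != page.h1.strip():
        issues.append("H1 does not match the llms.txt business name")
    if summary and summary not in visible:
        issues.append("description does not match the llms.txt summary")
    return issues
```

This covers only the name and description; services, contact details, and optional sections would need their own comparisons.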
§9 Machine-Readable Formats
This specification is available in machine-readable formats for programmatic access.
§10 Version History
Phase 6 standardisation release. Added /specifications/roadmap/ (theme-pegged forward plan with Active/Next/Future/On hold status flags), /specifications/extensions/ (rules for experimental x- prefixed files and the promotion path), and /specifications/i18n-a11y/ (multi-language publication, locale-tagged identity fields, RTL handling, accessibility of llms.html). Added the Discovery: directive to the robots-ai.txt specification (publishers MAY advertise AI Discovery Files on the same host). Added a formal media-type stance to the HTTP behaviour page (existing IANA types, no bespoke registrations). Expanded the file integrity and signing section on the security and privacy page with four candidate mechanisms, cross-cutting concerns, and interim publisher / consumer guidance. The Discovery: directive is the only normative addition to publisher behaviour; all other additions are forward-looking documentation.
Phase 5 standardisation release. Added /specifications/related-standards/ (positioning vs llmstxt.org, IETF AI Preferences, robots.txt, Schema.org, BCP 14, JSON Schema 2020-12, SemVer) and /specifications/implementations/ (public record of conformant implementations, IETF-style). Added an explicit llmstxt.org backward-compatibility statement to the llms.txt specification. Added a formal multi-domain and subdomain scoping rule to both the llms.txt and identity.json specifications (host-scoped files, cross-host identity asserted via sameAs). No normative requirements changed for existing publishers; the new scoping rules formalise behaviour the specification already implied.
Phase 4 standardisation release. Added /specifications/processing-model/ (seven-stage algorithm for conformant consumers), /specifications/consumer-guidance/ (what AI systems should do with AI Discovery Files), /specifications/test-vectors/ (canonical test suite framing), and reference-implementation framing on the AI Visibility Checker. No normative requirements changed.
Phase 3 standardisation release. Added /specifications/versioning/ (Semantic Versioning 2.0.0 commitments, deprecation timeline, lifecycle), /specifications/governance/ (proposal lifecycle, editorial process, working principles), /specifications/security-privacy/ (trust model, content-injection patterns, GDPR considerations, integrity primitives roadmap), and /specifications/http-behaviour/ (status codes, redirects, soft-404 detection, caching, rate limits). No normative requirements changed.
Phase 2 standardisation release. Added formal conformance specification (Essential / Recommended / Complete classes). Published machine-readable registry at /specifications/registry.json, spec meta-schema, and validator-output schema. Introduced versioned JSON Schema URLs (/v1/) alongside unversioned 'latest' aliases. Added optional BCP 47 language declaration field across all applicable AI Discovery Files. No normative requirements changed.
Phase 1 standardisation release. Added 'Status of This Document' block (Stable). Normalised normative requirement keywords to uppercase per RFC 2119 and RFC 8174. Added References section linking to /specifications/conventions/ and /licensing/. No normative requirements changed.
Added AI Visibility Directory registration guidance. Minor documentation update.
Added "Optional Content" section with specific examples. Added "Content Not Permitted" section clarifying what MUST NOT be included in llms.html files.
Initial publication. Defines requirements for human-readable HTML presentation of llms.txt content with Schema.org structured data.
Conformance
This file is required for the Complete conformance class only. A publisher claiming Complete conformance MUST publish a valid version of this file at the website's root. The Essential and Recommended classes do not require this file.
See the Conformance specification for full publisher and validator conformance criteria, including identity-consistency requirements across files and the relationship between self-declaration and Directory verification.
References
- Specification Conventions — RFC 2119 + RFC 8174 requirement keywords, document statuses, anchor naming, versioning, and language conventions used across every AI Discovery File specification.
- Licensing & Trademark — CC BY 4.0 for specification text and examples, MIT for JSON Schemas, and the free-use policy on the name "AI Discovery Files".