If you manage a website with documentation, product pages, or technical content, there's a file you probably haven't created yet, but AI models are already looking for it. It's called llms-full.txt, and it represents a fundamental shift in how content gets discovered and consumed online.
In some setups, you can provide llms-full.txt, which expands on llms.txt with a more exhaustive list of ingest-worthy pages and their content. Where llms.txt points to sources, llms-full.txt contains the entire content of a website's documentation in a single Markdown file. That distinction matters. While an llms.txt file acts as a curated table of contents, the full version gives AI systems everything they need in one request: no additional fetching, no HTML parsing, no wasted tokens on navigation menus and cookie banners.
Data from Profound reveals something unexpected: LLMs are accessing llms-full.txt even more frequently than the original llms.txt. That signal alone should change how you prioritize your implementation. This guide walks through exactly how to build one that works.
What llms-full.txt Actually Is (and Why It Exists)
The llms.txt standard was proposed by Jeremy Howard of Answer.AI in September 2024 to make web content easier for artificial intelligence systems to use efficiently. The core problem it solves is straightforward: large language models increasingly rely on website information, but face a critical limitation. Context windows are too small to handle most websites in their entirety, and converting complex HTML pages with navigation, ads, and JavaScript into LLM-friendly plain text is both difficult and imprecise.
The specification actually defines two complementary files. llms.txt provides a streamlined view of your documentation navigation to help AI systems quickly understand your site's structure, while llms-full.txt is a comprehensive file containing all your documentation in one place.
Think of the relationship this way: if llms.txt is the executive brief, llms-full.txt is the full-length book. It contains flattened, chunked text from your most-referenced documentation, everything required for AI to pull the most accurate, context-rich details.
Because it strips out JavaScript, navigation menus, and cookie banners, it ensures that retrieval is clean and deterministic.
The Origin Story That Explains the Design
Mintlify originally developed llms-full.txt in collaboration with Anthropic, which needed a cleaner way to feed its entire documentation into LLMs without parsing HTML. After seeing its impact, Mintlify rolled the feature out for all customers, and it was officially adopted into the llmstxt.org standard.
That history matters because it reveals the file's primary purpose: serving as a single ingestion point for AI systems that need complete context. A language model can retrieve your entire documentation corpus in a single HTTP request, which is ideal for RAG pipelines, AI-powered support bots, and coding assistants that need context-window-friendly snapshots of your knowledge base.
The Honest Case: What the Data Shows (and Doesn't)
Before you invest time building this file, you deserve a clear-eyed view of the evidence. The landscape is genuinely mixed. On the adoption side, ProGEO.ai published research in March 2026 measuring Fortune 500 adoption rates of llms.txt, JSON-LD, and AI directives in robots.txt, three signals closely associated with GEO maturity. Their finding: only 7.4% of the Fortune 500 have implemented llms.txt. A broader study from SE Ranking tells a similar story: in their dataset of nearly 300,000 domains, just 10.13% had an llms.txt file in place, a long way from the universal adoption of standards like robots.txt or sitemaps.
On the effectiveness question, the data is even more sobering. Both statistical analysis and machine learning showed no effect of llms.txt on how often a domain is cited by LLMs. Removing this variable from the XGBoost model actually improved its accuracy.
And yet the counter-evidence is compelling:
- ChatGPT now refers around 10% of new Vercel signups.
- Anthropic, creator of Claude, specifically asked Mintlify to implement llms.txt and llms-full.txt for its documentation.
- Google included an llms.txt file in its new Agent2Agent (A2A) protocol.
The nuanced takeaway: llms-full.txt probably won't boost your AI citations tomorrow. But for documentation-heavy products, this is where the standard actually shines-if you maintain API docs, SDKs, or technical guides, llms.txt provides a token-efficient entry point for LLMs. The file's value lies less in search visibility and more in content accuracy when AI systems reference your material.
Anatomy of an Effective llms-full.txt File
The llms-full.txt file should follow a logical structure that preserves the organization of your llms.txt file while incorporating the full content of each linked document. Here's the structural skeleton:
- H1 heading with your project or site name
- Blockquote summary of the project
- H2 sections matching the categories from your llms.txt
- H3 headings for individual documents within each section
- Full Markdown content of each document beneath its heading
- Horizontal rules (---) separating individual documents
llms-full.txt follows the same Markdown-based format as llms.txt, but each linked resource is followed by its full page content rather than just a description.
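Putting those rules together, a minimal llms-full.txt might look like the fragment below. The product name, URLs, and page content are all hypothetical placeholders:

```markdown
# Acme Analytics

> Acme Analytics is a product analytics platform. This file contains the full documentation in Markdown for AI consumption.

## Getting Started

### Quickstart

Source: https://example.com/docs/quickstart

Install the SDK with your package manager, then initialize the client with your project key...

---

### Authentication

Source: https://example.com/docs/authentication

All API requests require a bearer token. Generate one from the dashboard under Settings...

---
```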
Why Markdown and Not HTML
This isn't an arbitrary formatting choice. Tokens are the currency of LLMs: every word, punctuation mark, or formatting tag costs valuable tokens in a prompt. Markdown is lighter than JSON, XML, or HTML; it conveys meaning with fewer characters.
Platforms like Fern report that serving Markdown instead of HTML reduces token consumption by 90%+.
A typical documentation page might contain 20% actual content and 80% navigation, styling, scripts, and other elements. Your llms-full.txt file eliminates that waste entirely.
File Size and Context Window Considerations
There's a practical ceiling to think about. AI crawlers have soft limits on file size: files over 100KB are often partially indexed or deprioritized. That 100KB guidance applies to the llms.txt file itself. The llms-full.txt file will naturally be larger since it contains full content, but you still need to be strategic.
Context windows are expanding rapidly; Gemini 1.5 Pro, for example, offers 1 million tokens and Claude 3.5 Sonnet 200,000. But bigger isn't always better. Context window size isn't the real constraint; focus is. Models perform better with curated information than with comprehensive dumps.
A practical rule: aim for a file that covers your most essential documentation comprehensively rather than cramming every page you've ever published. If space is a concern, focus on including your most critical and frequently accessed content.
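As a rough planning aid, a common heuristic puts English prose at about 4 characters per token. The sketch below uses that heuristic to check whether a file leaves headroom in a given context window; both the 4-chars-per-token ratio and the reserve value are approximations, not exact figures, so use a real tokenizer for precise budgeting.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English prose.
    # Use a real tokenizer for exact counts.
    return max(1, len(text) // 4)

def fits_context(text: str, window_tokens: int, reserve: int = 2_000) -> bool:
    """Check the file fits, leaving headroom for the prompt and the reply."""
    return estimate_tokens(text) <= window_tokens - reserve

doc = "word " * 50_000                 # ~250,000 characters of documentation
print(estimate_tokens(doc))            # 62500 tokens by this heuristic
print(fits_context(doc, 200_000))      # True: fits a 200K-token window
```

This keeps the decision explicit: if your assembled file fails the check for the models you care about, trim the lowest-priority sections rather than letting crawlers truncate arbitrarily.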
Step-by-Step: Building Your llms-full.txt File
Step 1: Audit and Prioritize Your Content
Before writing a single line, decide what belongs in the file. Start by identifying the resources that really matter: quickstart guides, authentication docs, API references, SDKs, pricing pages, security policies, SLAs, and the top 10 most common support questions.
Not everything deserves inclusion. An effective file is not a dump of your entire site. It should resemble a smart table of contents: identification and promise via the summary, then priority content including pillar pages, categories, documentation, and proof pages.
For an e-commerce site, your llms.txt might simply link to the return policy and note that product pages exist, while your llms-full.txt would include the full text explaining how the search function and checkout process work.
Step 2: Convert Content to Clean Markdown
Start by gathering the full content of all documents referenced in your llms.txt file. For each link, access the document content, convert it to Markdown format if it's not already, and organize it according to the section structure of your llms.txt file.
Several tools handle HTML-to-Markdown conversion:
- Pandoc: Command-line tool for batch converting between document formats
- Firecrawl: Generates both llms.txt and llms-full.txt files through the web interface or via API
- Mintlify's free generator: Paste a URL and get a formatted starter file
- Wetrocloud's Website to Markdown Converter: Designed specifically for LLM-optimized output
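If you want to see the core transformation without installing anything, here is a deliberately naive sketch using only Python's standard library. It keeps headings and paragraphs and drops navigation, scripts, and styling, which is exactly the waste the tools above eliminate far more robustly; a real pipeline should use one of them, since this ignores links, lists, tables, and much else.

```python
from html.parser import HTMLParser

class SimpleMarkdownConverter(HTMLParser):
    """Naive HTML-to-Markdown converter: headings and paragraphs only."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
    SKIP_TAGS = {"script", "style", "nav", "footer", "header"}

    def __init__(self):
        super().__init__()
        self.out = []      # converted lines
        self.prefix = ""   # Markdown prefix for the next text node
        self.skip = 0      # depth inside tags we drop entirely

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip += 1
        elif tag in self.HEADINGS:
            self.prefix = self.HEADINGS[tag]

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS:
            self.skip = max(0, self.skip - 1)
        elif tag in self.HEADINGS or tag == "p":
            self.out.append("")  # blank line after a block element
            self.prefix = ""

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip:
            self.out.append(self.prefix + text)
            self.prefix = ""

def html_to_markdown(html: str) -> str:
    conv = SimpleMarkdownConverter()
    conv.feed(html)
    return "\n".join(conv.out).strip()

page = "<nav>Docs | Blog</nav><h1>Quickstart</h1><p>Install the SDK.</p>"
print(html_to_markdown(page))  # prints "# Quickstart", blank line, "Install the SDK."
```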
If you're on WordPress, Yoast SEO simplifies the process by generating and managing the file for you, with one-click activation from settings and weekly regeneration using WordPress cron jobs.
Step 3: Assemble the File Structure
The structure should follow: project or site name as H1, brief description as blockquote, H2 section names, H3 document titles, with full content of each document beneath.
Keep three formatting principles in mind:
- Maintain consistent formatting: use a consistent format for section headings, document titles, and source information.
- Include source URLs: always include the original URL for each document to provide proper attribution and reference.
- Preserve document structure: maintain the original heading structure and formatting of each document as much as possible.
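Those principles can be wired into a small generator. The sketch below assumes you have already collected each document's section, title, source URL, and converted Markdown; the `Doc` field names are hypothetical, not from any particular tool.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    section: str   # H2 category, matching your llms.txt
    title: str     # H3 document title
    url: str       # original source URL, for attribution
    markdown: str  # full converted page content

def build_llms_full(site: str, summary: str, docs: list[Doc]) -> str:
    """Assemble an llms-full.txt body: H1, blockquote, H2/H3 sections, --- separators."""
    lines = [f"# {site}", "", f"> {summary}", ""]
    current_section = None
    for doc in docs:
        if doc.section != current_section:  # emit each H2 section once
            lines += [f"## {doc.section}", ""]
            current_section = doc.section
        lines += [f"### {doc.title}", "",
                  f"Source: {doc.url}", "",
                  doc.markdown.strip(), "",
                  "---", ""]
    return "\n".join(lines)
```

Feeding it a list of documents already sorted by section keeps the output aligned with your llms.txt table of contents.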
Step 4: Host and Validate
Serve both files from the root of your primary domain. For documentation subdomains, publish a copy there too. Set the Content-Type header to text/plain; charset=utf-8. Enable compression (gzip/Brotli) on your server-large files benefit significantly.
Both files should be at the root of your domain, served with the correct MIME types, and your llms.txt should include a link to llms-full.txt.
Validation matters. Test that the file returns HTTP 200, uses correct encoding, and is not unexpectedly blocked by your CDN. Then go further: run a tool that expands your llms.txt file into an LLM context file and test a number of language models to see if they can answer questions about your content.
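A lightweight check along those lines can be scripted. The rules below mirror the guidance above (HTTP 200, text/plain with UTF-8, no CDN error page); in practice you would feed this function the status, Content-Type header, and body from an `urllib.request.urlopen` call against your /llms.txt and /llms-full.txt URLs.

```python
def validate_llms_response(status: int, content_type: str, body: bytes) -> list[str]:
    """Return a list of problems; an empty list means the response looks healthy."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    ct = content_type.lower()
    if not ct.startswith("text/plain"):
        problems.append(f"expected text/plain, got {content_type!r}")
    if "charset=utf-8" not in ct:
        problems.append("Content-Type is missing charset=utf-8")
    try:
        body.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("body is not valid UTF-8")
    if b"<html" in body[:512].lower():
        problems.append("response looks like HTML (CDN error page?)")
    return problems
```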
Platform-Specific Implementation Paths
The tooling ecosystem has matured significantly. Your approach depends on your stack.

Documentation platforms with auto-generation:
- Fern automatically generates token-optimized llms.txt and llms-full.txt files whenever your documentation changes, ensuring AI agents always receive current context.
- Mintlify automatically generates and hosts /llms.txt, /llms-full.txt, and .md versions of all pages for LLM optimization.
- GitBook added llms.txt support in January 2025, with llms-full.txt and .md page support following in June 2025.
Static site generators with plugins:
- sphinx-llms-txt generates llms.txt with zero configuration. Read the Docs also serves llms.txt and llms-full.txt from the default version of your project automatically.
- Eleventy has eleventy-plugin-llms-txt. Gatsby has gatsby-plugin-llms-txt.
WordPress:
- Yoast SEO includes the five most recently updated posts, pages, and custom post types in the llms.txt file, with priority given to cornerstone content.
- The Website LLMs.txt plugin (30,000+ installs) auto-generates files and tracks if GPTBot or ClaudeBot accesses them.
Custom stacks:
- On modern stacks like Next.js, Nuxt, or SvelteKit, you can generate the file from your headless CMS or content catalogue, then expose it as a static asset.
Advanced Techniques: Getting More From Your File
Segment by Product Area
For multi-product companies, a single monolithic file may not serve you well. You can create separate files for different product areas-organizing by API navigation, complete API docs, tutorial navigation, and complete tutorials at their respective subpaths.
Both files can live at any level of your documentation hierarchy: /llms.txt, /llms-full.txt, /docs/llms.txt, /docs/ai-features/llms-full.txt, and so on.
Filter by Language or Specification
You can filter llms.txt and llms-full.txt output with query parameters like lang and excludeSpec to reduce token usage. Fern, for example, lets users get a clean, language-specific output they can feed to AI tools when writing code, and even add a dropdown in the navbar linking to different filtered versions of llms-full.txt.
Control What AI Sees vs. What Humans See
Some platforms now offer granular content controls. Within pages, use <llms-only> and <llms-ignore> tags to control what content is exposed to AI versus human readers on your documentation site.
The <llms-only> tag shows content to AI but hides it from human readers, useful for technical context that's verbose but helpful for AI, like implementation details or architecture notes.
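Platforms that support these tags implement the filtering for you, but the underlying idea is simple to sketch. Assuming the tags appear literally in your Markdown source, a pre-processor might render the two variants like this (a simplified sketch that does not handle nested tags):

```python
import re

def render_for(audience: str, text: str) -> str:
    """Render source text for 'ai' or 'human' readers.

    <llms-only> content is kept for AI and dropped for humans;
    <llms-ignore> content is kept for humans and dropped for AI."""
    if audience == "ai":
        text = re.sub(r"<llms-ignore>.*?</llms-ignore>", "", text, flags=re.S)
        text = re.sub(r"</?llms-only>", "", text)
    else:
        text = re.sub(r"<llms-only>.*?</llms-only>", "", text, flags=re.S)
        text = re.sub(r"</?llms-ignore>", "", text)
    return text
```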
Integrate With Your CI/CD Pipeline
Set up CI/CD pipelines to automatically update both files when documentation changes. This is non-negotiable for any team that updates docs more than monthly. Stale content is worse than no file. Aim to regenerate llms-full.txt whenever significant documentation changes are published.
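One simple way to wire this into CI is to fingerprint your Markdown sources and regenerate llms-full.txt only when the hash changes. The directory layout here is an assumption about your repo; adapt the glob to wherever your docs live.

```python
import hashlib
import pathlib

def docs_fingerprint(docs_dir: str) -> str:
    """Stable hash of all Markdown sources under docs_dir.

    A CI job can compare this against the hash stored from the last build
    and skip regenerating llms-full.txt when nothing has changed."""
    h = hashlib.sha256()
    for path in sorted(pathlib.Path(docs_dir).rglob("*.md")):
        h.update(path.as_posix().encode())  # include the path, so renames count
        h.update(path.read_bytes())
    return h.hexdigest()
```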
Common Mistakes That Undermine Your File
Treating it as a duplicate sitemap. Keep llms.txt at 20-50 links max; more isn't curation, it's dumping. Your llms-full.txt is the place for depth, not breadth across every URL you've ever published.

Including gated content. If your site requires authentication, llms.txt and llms-full.txt also require authentication to view, and LLMs and AI tools that cannot authenticate into your site cannot access these files. Listing pages behind login walls wastes space and confuses AI systems.

Forgetting security implications. llms-full.txt files consolidate all your documentation, potentially exposing information you didn't intend to make easily accessible.
If an attacker gains write access, they could inject malicious instructions or misleading content. Treat it as a security-sensitive file: automate generation, require code review, and monitor for changes.
Never updating the file. Maintenance is essential: an outdated file quickly loses value.
Set a cadence-monthly, or with each strategic content release-and add a change log in Git or internal documentation to track why a page was added or removed.
Creating unnecessary duplicate content. A few SEOs have started creating markdown copies of every blog article as .md files, then linking all these from their llms.txt. This approach creates unnecessary duplicate content without clear benefits.
How to Measure Whether It's Working
You can't manage what you can't measure, and measurement here remains imperfect. A few approaches worth combining:

Check server logs. Look for requests to /llms.txt and /llms-full.txt. You'll see which bots (PerplexityBot, GPTBot, and others) are fetching them and how often.
Some platforms like Fern offer built-in analytics that track traffic by LLM provider and break down bot versus human visitors at the page level.
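A few lines of Python over a raw access log can produce a similar breakdown yourself. The bot names below are user-agent substrings these crawlers are known to use; the combined log format in the example is an assumption about your server.

```python
from collections import Counter

AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot", "Google-Extended")

def count_ai_fetches(log_lines, path="/llms-full.txt"):
    """Count requests for `path` per AI crawler, from raw access-log lines."""
    counts = Counter()
    for line in log_lines:
        if path not in line:
            continue  # only count fetches of the file we care about
        for bot in AI_BOTS:
            if bot in line:
                counts[bot] += 1
    return counts

logs = [
    '1.2.3.4 - - [01/Jan/2025] "GET /llms-full.txt HTTP/1.1" 200 512 "-" "GPTBot/1.0"',
    '5.6.7.8 - - [01/Jan/2025] "GET /llms-full.txt HTTP/1.1" 200 512 "-" "PerplexityBot/1.0"',
    '9.9.9.9 - - [01/Jan/2025] "GET /index.html HTTP/1.1" 200 512 "-" "GPTBot/1.0"',
]
print(count_ai_fetches(logs))  # GPTBot and PerplexityBot each counted once
```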
Test with actual AI tools. Provide the file content to your AI system by pasting the link, copying the file contents directly into your prompt, or using the AI tool's file upload feature. Go to your llms-full.txt URL, copy the contents or URL into your chat, and ask specific questions. If the model can accurately answer detailed questions about your product using only that file, it's working as intended.

Monitor AI referral traffic. Track visits from chat.openai.com, perplexity.ai, and claude.ai in your analytics. Attribution isn't always clean, but traffic from AI referrers often reflects users who've already asked a question, seen an answer, and are now acting on it.
The honest assessment: implementation costs are low (roughly 2-8 hours) with uncertain but potentially high future ROI. For documentation-heavy sites, API platforms, and developer tools, that bet makes strategic sense. For a local bakery or small service business, you'll get more value from Google Business Profile optimization and local SEO fundamentals than from a documentation manifest file.
The web has two audiences now. Humans read your pages. AI systems read your structured files. Building an effective llms-full.txt file isn't about chasing a ranking signal that may or may not exist-it's about ensuring that when an AI model does reference your content, it gets the full, accurate picture. Your audience now includes LLMs alongside humans, and optimizing for AI isn't about gaming a system-it's about ensuring your content is accurately represented.
Start with your most valuable documentation. Convert it to clean Markdown. Structure it logically. Automate the updates. Then test whether AI models can actually use it to answer real questions about your product. That last step is the one most teams skip-and it's the one that separates a file that sits harmlessly in your root directory from one that actively improves how your brand shows up in the age of AI-mediated discovery.
Ready to optimize for the AI era?
Get a free AEO audit and discover how your brand shows up in AI-powered search.