
How to Create an llms.txt File in 2026

Seven chapters covering what llms.txt is, the llmstxt.org spec, what to include, basic vs llms-full.txt, hosting at the root, validation, and the breakages we see most often.

Jacque Bichara
Founder & Lead Strategist, Capconvert
May 20, 2026 | Updated May 20, 2026 | 12 min read
Who this is for: Marketers and developers shipping their first llms.txt for AI engine visibility, or auditing an existing file against the current llmstxt.org spec.
TL;DR
  • llms.txt is a markdown file at /llms.txt that lists your most important content for AI crawlers and for tools that read the site programmatically.
  • Spec lives at llmstxt.org. Proposed by Jeremy Howard in September 2024. Adopted by Anthropic, Cloudflare, Stripe, Mintlify, Hugging Face, and many others.
  • Not yet a formal W3C standard. Voluntary - each AI engine decides whether to fetch. Adoption is growing fast but uneven.
  • Structure: H1 brand name, blockquote summary, sections with H2 headings, markdown links to your key pages. Optional sections at the end.
  • Two variants: /llms.txt (basic, nav-style) and /llms-full.txt (expanded with full content inline). Most brands ship just llms.txt; docs sites often ship both.

Chapter 1. Before you start

llms.txt is the newest entry in the "files at site root that machines read" family, alongside robots.txt, sitemap.xml, and security.txt. It exists because AI crawlers and LLM-powered tools need a concise, structured map of what's important on a site - which a full HTML crawl is bad at producing and a sitemap.xml doesn't communicate at all.

  • Identify the 10-30 most important URLs on your site. These become the body of llms.txt. Don't list everything; the file is meant to be curated.
  • Write a one-paragraph summary of what your site is and does. This goes in the blockquote at the top.
  • Decide on section structure. Common patterns: by content type (Products / Documentation / Blog), by audience (Customers / Developers / Press), or by topic (Analytics / SEO / Paid Search).
  • Confirm you can host plain markdown at the site root. Most platforms (Shopify, WordPress, Webflow, Next.js, static sites) allow file uploads or routes that serve markdown as plain text.
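
The checklist above maps onto a small data structure: a name, a summary, and named sections of links. The sketch below is a minimal illustration, not an official generator. It renders that structure into the layout described in the TL;DR; the function name `render_llms_txt` and the sample values are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# One link is (title, url, description); the description may be empty.
Link = Tuple[str, str, str]


def render_llms_txt(name: str, summary: str, sections: Dict[str, List[Link]]) -> str:
    """Render an llms.txt body: H1, blockquote summary, then H2 sections of links."""
    lines = [f"# {name}", ""]
    # Prefix every summary line with "> " so a multi-line summary stays one blockquote.
    lines += [f"> {part.strip()}" for part in summary.strip().splitlines()]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        for title, url, description in links:
            entry = f"- [{title}]({url})"
            if description:
                entry += f": {description}"
            lines.append(entry)
    return "\n".join(lines) + "\n"


# Placeholder data for illustration only.
example = render_llms_txt(
    "Example Brand",
    "Example Brand is a B2C ecommerce store selling smart toothbrushes.",
    {
        "Products": [
            ("Pro Smart Toothbrush", "https://www.example.com/products/pro", "Flagship smart toothbrush"),
        ],
        "Optional": [("Blog", "https://www.example.com/blog", "")],
    },
)
print(example)
```

Sections render in the order they appear in the dictionary, so put the "Optional" section last.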
From the audit notes
Of 47 ecommerce storefronts we audited over the past 24 months, only 4 shipped llms.txt files - an early-adopter signal. Of those 4, all were tech-forward DTC brands that also had the highest AI-search citation rates in the dataset. Causation is unproven; the correlation is consistent enough that we now ship llms.txt by default on every new client engagement.

Chapter 2. What is llms.txt and which AI engines actually read it?

llms.txt is a markdown file at /llms.txt proposed by Jeremy Howard in September 2024. The spec lives at llmstxt.org. The goal: give AI crawlers and LLM-powered tools a concise, curated entry point to your site - the equivalent of a navigation menu, written for machines.

| AI engine / system | Reads llms.txt? | How we know |
| --- | --- | --- |
| Anthropic (Claude) | Yes (publishes their own) | anthropic.com/llms.txt exists |
| Mintlify (docs platform) | Yes (autogenerates for customers) | Mintlify-built docs sites ship llms.txt by default |
| Cursor (AI code editor) | Yes (uses llms.txt for project context) | Documented in Cursor's docs |
| ChatGPT | Partial / uncertain | No official statement; some evidence of consumption |
| Perplexity | Partial / uncertain | No official statement |
| Google AI Overviews | No official adoption yet | Google uses regular crawl + structured data |
| Bing / Microsoft Copilot | No official adoption yet | Uses Bing index + structured data |

Adoption is growing but uneven. The cost of shipping llms.txt is small (one file, low maintenance); the upside compounds as more AI tools adopt the standard. Treat it like sitemap.xml in 2005, when the Sitemaps protocol was new - early adoption was a hedge, mass adoption became the default.

Chapter 3. Required structure per the llmstxt.org spec

The spec is minimal but specific. Per llmstxt.org, a valid llms.txt has:

  1. An H1 with the project/brand name. Required, exactly one.
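  2. A blockquote (lines starting with >) containing a short summary. The spec marks only the H1 as strictly required, but treat the summary as mandatory in practice: it is the context every link below depends on.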
  3. Optional additional context as plain paragraphs after the blockquote.
  4. Zero or more H2 sections, each with a list of markdown links: - [Link Text](URL): optional description.
  5. An optional "Optional" section at the end (always literally named "Optional"), for secondary links the consumer can skip if context is tight.

Example file (the canonical pattern):

# Example Brand

> Example Brand is a B2C ecommerce store selling smart toothbrushes. Founded
> in 2019. Operates online + in 12 US cities.

This file is for AI crawlers and tools indexing Example Brand. Start with the
sections below for the most important content.

## Products

- [Pro Smart Toothbrush](https://www.example.com/products/pro): Flagship smart toothbrush, $129
- [Travel Smart Toothbrush](https://www.example.com/products/travel): Compact version for travel
- [Toothpaste Capsules](https://www.example.com/products/capsules): Refillable toothpaste capsules

## Documentation

- [Getting Started Guide](https://www.example.com/docs/getting-started): First-time setup
- [App Pairing Guide](https://www.example.com/docs/pairing): How to pair with the mobile app
- [Brushing Modes](https://www.example.com/docs/modes): All seven brushing modes explained

## Company

- [About](https://www.example.com/about): Company history and team
- [Press](https://www.example.com/press): Press coverage and assets
- [Contact](https://www.example.com/contact): Customer service contacts

## Optional

- [Blog](https://www.example.com/blog): Editorial posts on dental care
- [Reviews](https://www.example.com/reviews): Customer reviews and testimonials
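
The structure in this chapter can be checked mechanically. The sketch below is a minimal, non-authoritative linter, not an official validator. It reports findings for the H1, the blockquote summary, and the link-list format inside H2 sections. The function name `validate_llms_txt` and the regular expression are assumptions made for this example, and the regex is deliberately stricter than a full markdown parser.

```python
import re
from typing import List

# A list item of the form "- [text](url)" with an optional ": description".
LINK_ITEM = re.compile(r"^- \[[^\]]+\]\([^)\s]+\)(: .+)?$")


def validate_llms_txt(text: str) -> List[str]:
    """Return a list of findings; an empty list means the structural checks passed."""
    problems = []
    lines = [line.rstrip() for line in text.splitlines()]
    non_blank = [line for line in lines if line]

    h1_count = sum(1 for line in lines if line.startswith("# "))
    if h1_count != 1:
        problems.append(f"expected exactly one H1, found {h1_count}")
    if non_blank and not non_blank[0].startswith("# "):
        problems.append("first non-blank line is not the H1")
    if not any(line.startswith("> ") for line in lines):
        problems.append("no blockquote summary")

    in_section = False
    for number, line in enumerate(lines, start=1):
        if line.startswith("## "):
            in_section = True
        elif in_section and line.startswith("- ") and not LINK_ITEM.match(line):
            problems.append(f"line {number}: list item is not '- [text](url): description'")
    return problems


good = "# Example Brand\n\n> Summary.\n\n## Products\n\n- [Pro](https://www.example.com/products/pro): Flagship\n"
bad = "> Summary first.\n\n## Products\n\n- Pro toothbrush https://www.example.com/products/pro\n"
print(validate_llms_txt(good))
print(validate_llms_txt(bad))
```

Run it in CI against the file you are about to deploy, so a broken heading or a bare URL in a list is caught before publication.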

Chapter 4. What should you list in llms.txt?

Curate, don't enumerate. llms.txt is meant to highlight your most important content, not catalogue every URL on the site. That's what sitemap.xml is for. Aim for 10-30 links across 3-6 sections.

Strong candidates for inclusion:

  • Homepage and primary landing pages. The homepage is usually the single most important URL on the site.
  • Top product or service pages. The pages you want AI engines to cite when asked about your offering.
  • Pricing page. AI engines frequently get asked about pricing; having it clearly linked helps them answer accurately.
  • Documentation / help center. Where AI engines can find authoritative how-to content for your product.
  • About / Company. Entity disambiguation - "who is Example Brand" gets answered cleanly.
  • Editorial pillar content. Your highest-authority blog posts or guides.
  • Contact info. Customer-service contact details, which users sometimes ask AI engines for.

Weak candidates (don't bloat the file):

  • Every individual blog post. Link to the blog index, not 500 individual posts.
  • Every product variant. Link to the parent product, not every size/color.
  • Internal admin / dashboard URLs. AI engines don't need these.
  • Marketing micropages for specific campaigns. They date quickly.
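
The curation guidance in this chapter can be turned into a rough lint. This is a sketch under stated assumptions, not a rule from the spec. The 10-30 link and 3-6 section ranges are this guide's suggestions. Treating a query string as a sign of a variant or campaign URL is a heuristic invented for this example, and `curation_warnings` is a hypothetical helper name.

```python
import re
from collections import Counter
from typing import List, Tuple
from urllib.parse import urlsplit

# Captures the URL of a "- [text](url)" list item.
LINK = re.compile(r"^- \[[^\]]+\]\(([^)\s]+)\)")


def curation_warnings(
    text: str,
    link_range: Tuple[int, int] = (10, 30),
    section_range: Tuple[int, int] = (3, 6),
) -> List[str]:
    """Flag files that look enumerated rather than curated."""
    lines = text.splitlines()
    sections = [line for line in lines if line.startswith("## ")]
    urls = [match.group(1) for match in map(LINK.match, lines) if match]

    warnings = []
    if not link_range[0] <= len(urls) <= link_range[1]:
        warnings.append(f"{len(urls)} links; this guide suggests {link_range[0]}-{link_range[1]}")
    if not section_range[0] <= len(sections) <= section_range[1]:
        warnings.append(f"{len(sections)} sections; this guide suggests {section_range[0]}-{section_range[1]}")
    for url, count in Counter(urls).items():
        if count > 1:
            warnings.append(f"duplicate URL listed {count} times: {url}")
    for url in urls:
        # Heuristic only: query strings often mark variants or campaign pages.
        if urlsplit(url).query:
            warnings.append(f"query string suggests a variant or campaign URL: {url}")
    return warnings


sample = (
    "# X\n\n## Products\n\n"
    "- [A](https://www.example.com/a)\n"
    "- [A again](https://www.example.com/a)\n"
    "- [B](https://www.example.com/b?variant=red)\n"
)
for warning in curation_warnings(sample):
    print(warning)
```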

Chapter 5. llms.txt vs llms-full.txt: which do you need?

Two file variants are in common use. Both are valid; most brands need only one.

| File | Contains | Best for |
| --- | --- | --- |
| /llms.txt | Curated links with short descriptions | Almost every site (default choice) |
| /llms-full.txt | Full content of each linked page inline | Technical docs sites, API references, single-page-app docs |

For ecommerce stores, blog sites, and most content sites: ship just /llms.txt. The basic file is small (a few KB) and gives AI engines the navigation map they need.

For documentation sites, API references, or any site where the entire knowledge base could fit in an LLM's context window: ship /llms-full.txt alongside the basic file. Mintlify autogenerates both for the docs sites it hosts; Anthropic publishes both for their documentation. The full file lets AI tools pull the entire docs corpus in one request, useful for IDE assistants like Cursor.
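
To judge whether a docs corpus is small enough for a single full file, you can assemble it and estimate its size. The sketch below rests on several assumptions: each page is already available as markdown text, one H2 per page is an acceptable layout (no particular llms-full.txt layout is prescribed here), and four characters per token is a usable rule of thumb. Real tokenizers vary by model and language, and the 100,000-token budget is a placeholder, not a limit for any particular engine.

```python
from typing import Dict


def build_llms_full(name: str, pages: Dict[str, str]) -> str:
    """Concatenate page bodies under one H1, each page under its own H2 heading."""
    parts = [f"# {name}"]
    for title, body in pages.items():
        parts.append(f"## {title}\n\n{body.strip()}")
    return "\n\n".join(parts) + "\n"


def rough_token_estimate(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough size estimate based on character count alone."""
    return int(len(text) / chars_per_token)


# Placeholder page text for illustration only.
pages = {
    "Getting Started": "Charge the brush, then install the app.",
    "Pairing": "Hold the power button until the light blinks.",
}
full = build_llms_full("Example Brand Docs", pages)
fits = rough_token_estimate(full) <= 100_000  # hypothetical context budget
print(rough_token_estimate(full), fits)
```

If the estimate lands far above the budget you care about, that is a signal to ship only the basic file, or to split the full file by product area.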

Chapter 6. Hosting and validation

Host at the literal path /llms.txt at the site root. Not nested under /about/, not at a subdomain, not as a redirect from another location. The root path is the location the spec names and the one consumers request first: https://example.com/llms.txt.

  1. Serve the file with a text/plain or text/markdown content type. Most static hosts default to text/plain for .txt files.
  2. Make sure the file returns HTTP 200, is not behind a login, and is not blocked by robots.txt.
  3. Test with curl -I https://www.example.com/llms.txt to confirm response code and content type.
  4. Validate the file against the llmstxt.org spec - format expectations are simple but strict (H1 first, blockquote summary present, section format consistent).
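
The status and content-type checks above can also be scripted. This is a minimal sketch using only the Python standard library. The helper names `hosting_problems` and `check_url` are assumptions made for this example. `check_url` makes a real network request, and urllib follows redirects silently, so the final URL is compared with the requested one.

```python
import urllib.error
import urllib.request
from typing import List, Optional

ACCEPTED_TYPES = {"text/plain", "text/markdown"}


def hosting_problems(status: int, content_type: Optional[str]) -> List[str]:
    """Check a response's status code and Content-Type header."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type not in ACCEPTED_TYPES:
        problems.append(f"unexpected content type: {media_type or 'missing'}")
    return problems


def check_url(url: str, timeout: float = 10.0) -> List[str]:
    """Send a HEAD request and report problems. Connection failures propagate as URLError."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            problems = hosting_problems(response.status, response.headers.get("Content-Type"))
            if response.geturl() != url:
                problems.append(f"redirected to {response.geturl()}")
            return problems
    except urllib.error.HTTPError as error:
        return hosting_problems(error.code, error.headers.get("Content-Type"))


print(hosting_problems(200, "text/plain; charset=utf-8"))
print(hosting_problems(200, "text/html"))
# Against a live site: check_url("https://www.example.com/llms.txt")
```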

Common hosting paths per platform:

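  • Shopify: upload as a theme file, route via redirect, or use a third-party app. Prefer an option that serves the file directly, since a redirect conflicts with the hosting rule above.
  • WordPress: upload to the site's document root (often /public_html/) via FTP or a file-upload plugin.
  • Webflow: hosting custom files requires a workaround - usually a redirect to a hosted URL elsewhere. Treat this as a fallback and confirm the final response still returns 200 with a plain-text content type.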
  • Next.js / static sites: drop the file in /public/ and it serves at /llms.txt automatically.
  • Custom server: add a route that serves the file with the correct content type.
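
For the custom-server case, the route only needs to return the file's bytes with a plain-text content type. The sketch below uses Python's built-in `http.server` for illustration; a production stack would use its own framework's routing. The names `build_response`, `LlmsTxtHandler`, and `serve`, and the on-disk location `llms.txt`, are assumptions made for this example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path
from typing import Dict, Tuple

CONTENT_TYPE = "text/plain; charset=utf-8"


def build_response(request_path: str, file_path: Path) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body); only the exact path /llms.txt is served."""
    if request_path == "/llms.txt" and file_path.is_file():
        status, body = 200, file_path.read_bytes()
    else:
        status, body = 404, b"Not found\n"
    headers = {"Content-Type": CONTENT_TYPE, "Content-Length": str(len(body))}
    return status, headers, body


class LlmsTxtHandler(BaseHTTPRequestHandler):
    file_path = Path("llms.txt")  # assumed location of the file on disk

    def _send(self, include_body: bool) -> None:
        status, headers, body = build_response(self.path, self.file_path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        if include_body:
            self.wfile.write(body)

    def do_GET(self) -> None:
        self._send(include_body=True)

    def do_HEAD(self) -> None:
        # curl -I sends a HEAD request, so answer it with headers only.
        self._send(include_body=False)


def serve(port: int = 8000) -> None:
    """Serve on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), LlmsTxtHandler).serve_forever()
```

Calling `serve()` and then running `curl -I http://127.0.0.1:8000/llms.txt` exercises the same check as step 3 above.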

Chapter 7. The breakages we see most often

Across the small set of sites we've audited running llms.txt (4 of 47 ecommerce sites plus ~20 other public files), the breakages we see most often:

  • File served as text/html instead of plain text: parsers may reject it. Set content type explicitly.
  • Missing or extra H1: spec requires exactly one H1 with the brand/project name.
  • Blockquote missing: the file has no one-paragraph summary inside > lines, so consumers get a list of links with no context.
  • Bloated link list: 100+ URLs across no clear sections. Defeats the purpose of curation.
  • Outdated links: pages have moved, llms.txt still points at the old URLs. 404s and redirects waste the AI crawler's request.
  • Hosted at a non-root path (/docs/llms.txt instead of /llms.txt): consumers that request only the root path will never find it.
  • llms-full.txt shipped for a brochure site with no docs: produces a giant file no AI engine has reason to consume.
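
Outdated links are the easiest of these breakages to catch automatically. The sketch below extracts every markdown link from the file and reports any URL that does not answer with HTTP 200. The helper names are assumptions made for this example. The status lookup is passed in as a function so the logic can be tried without network access, and `http_status` is one possible implementation that disables urllib's automatic redirect following so that a 301 or 302 is reported rather than hidden.

```python
import re
import urllib.error
import urllib.request
from typing import Callable, Dict, List

LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def extract_links(text: str) -> List[str]:
    """Return every markdown link target in the order it appears."""
    return [url for _title, url in LINK.findall(text)]


def stale_links(text: str, get_status: Callable[[str], int]) -> Dict[str, int]:
    """Map each linked URL that does not return 200 to the status it returned."""
    result = {}
    for url in extract_links(text):
        status = get_status(url)
        if status != 200:
            result[url] = status
    return result


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # decline to follow, so the 3xx surfaces as an HTTPError


_opener = urllib.request.build_opener(_NoRedirect)


def http_status(url: str, timeout: float = 10.0) -> int:
    """HEAD the URL and return its status code. Some servers reject HEAD requests."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with _opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


# Offline demonstration with a stand-in for http_status.
fake_statuses = {
    "https://www.example.com/a": 200,
    "https://www.example.com/old": 404,
    "https://www.example.com/moved": 301,
}
text = (
    "- [A](https://www.example.com/a): ok\n"
    "- [Old](https://www.example.com/old)\n"
    "- [Moved](https://www.example.com/moved): relocated\n"
)
print(stale_links(text, fake_statuses.__getitem__))
```

Against a live file, pass `http_status` as the second argument instead of the stand-in.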

We track llms.txt presence and validity through our Sentry product's AI-search rule set.

FAQ

Is llms.txt required?

No. It's a voluntary spec, not enforced by any standards body. Sites without it work fine; AI engines fall back to standard HTML crawling. llms.txt is an optimization, not a requirement.

Which AI engines actually read llms.txt?

Confirmed: Anthropic publishes their own; Mintlify autogenerates for customer docs sites; Cursor uses it for project context. ChatGPT, Perplexity, Google AI Overviews, and Bing have not publicly adopted it as of May 2026. Adoption is growing fast but uneven.

What's the difference between llms.txt and robots.txt?

robots.txt tells crawlers what they can and cannot access. llms.txt tells AI tools what you'd like them to prioritize. They serve different purposes: robots.txt is access control, llms.txt is a curated navigation hint. Ship both.
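
The two files interact: as Chapter 6 notes, a robots.txt rule can block /llms.txt itself. The sketch below uses Python's standard `urllib.robotparser` to test that. The helper name `llms_txt_access` and the user agents "ExampleBot" and "OtherBot" are hypothetical; substitute the crawler names you actually care about.

```python
from typing import Dict, Iterable
from urllib.robotparser import RobotFileParser


def llms_txt_access(robots_txt: str, site: str, user_agents: Iterable[str]) -> Dict[str, bool]:
    """For each user agent, report whether robots.txt allows fetching /llms.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    url = site.rstrip("/") + "/llms.txt"
    return {agent: parser.can_fetch(agent, url) for agent in user_agents}


# Sample rules: everyone is kept out of /admin/, and ExampleBot is blocked entirely.
ROBOTS = """\
User-agent: *
Disallow: /admin/

User-agent: ExampleBot
Disallow: /
"""

access = llms_txt_access(ROBOTS, "https://www.example.com", ["ExampleBot", "OtherBot"])
print(access)
```

A False value means the crawler is told not to fetch the file you published for it, which silently defeats the point of shipping it.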

Should I list every page on my site in llms.txt?

No. llms.txt is meant to be curated - the 10-30 most important pages, not every URL. That's what sitemap.xml is for. AI engines lose signal when llms.txt is bloated; curation is the point.

Where exactly do I host it?

At the literal path /llms.txt at the site root - so https://www.example.com/llms.txt. Not nested under another directory, not at a subdomain, not as a redirect. Serve as text/plain or text/markdown content type.

How often should I update llms.txt?

When the curated link list changes. For most ecommerce sites that's quarterly at most - new flagship products, retired SKUs, restructured navigation. For documentation sites with frequent content changes, monthly. Auto-generation from your CMS is the most maintainable pattern long-term.

References

  1. llmstxt.org. "The /llms.txt file." llmstxt.org
  2. Anthropic. "Example llms.txt." docs.anthropic.com/llms.txt
  3. Mintlify. "llms.txt and llms-full.txt support." mintlify.com/docs/settings/llmstxt
  4. Cursor. "Using llms.txt for project context." docs.cursor.com
  5. Jeremy Howard. "Proposing llms.txt (original September 2024 announcement)." answer.ai/posts/2024-09-03-llmstxt.html