- llms.txt is a markdown file at /llms.txt that lists your most important content for AI crawlers and human readers consuming the site programmatically.
- Spec lives at llmstxt.org. Proposed by Jeremy Howard in September 2024. Adopted by Anthropic, Cloudflare, Stripe, Mintlify, Hugging Face, and many others.
- Not yet a formal W3C standard. Voluntary - each AI engine decides whether to fetch. Adoption is growing fast but uneven.
- Structure: H1 brand name, blockquote summary, sections with H2 headings, markdown links to your key pages. Optional sections at the end.
- Two variants: /llms.txt (basic, nav-style) and /llms-full.txt (expanded with full content inline). Most brands ship just llms.txt; docs sites often ship both.
Chapter 1. Before you start
llms.txt is the newest entry in the "files at site root that machines read" family,
alongside robots.txt, sitemap.xml, and security.txt.
It exists because AI crawlers and LLM-powered tools need a concise, structured map of
what's important on a site - which a full HTML crawl is bad at producing and a
sitemap.xml doesn't communicate at all.
- Identify the 10-30 most important URLs on your site. These become the body of llms.txt. Don't list everything; the file is meant to be curated.
- Write a one-paragraph summary of what your site is and does. This goes in the blockquote at the top.
- Decide on section structure. Common patterns: by content type (Products / Documentation / Blog), by audience (Customers / Developers / Press), or by topic (Analytics / SEO / Paid Search).
- Confirm you can host plain markdown at the site root. Most platforms (Shopify, WordPress, Webflow, Next.js, static sites) allow file uploads or routes that serve markdown as plain text.
Chapter 2. What is llms.txt and which AI engines actually read it?
llms.txt is a markdown file at /llms.txt proposed by Jeremy
Howard in September 2024. The spec lives at llmstxt.org.
The goal: give AI crawlers and LLM-powered tools a concise, curated entry point to your
site - the equivalent of a navigation menu, written for machines.
| AI engine / system | Reads llms.txt? | How we know |
|---|---|---|
| Anthropic (Claude) | Yes (publishes their own) | anthropic.com/llms.txt exists |
| Mintlify (docs platform) | Yes (autogenerates for customers) | Mintlify-built docs sites ship llms.txt by default |
| Cursor (AI code editor) | Yes (uses llms.txt for project context) | Documented in Cursor's docs |
| ChatGPT | Partial / uncertain | No official statement; some evidence of consumption |
| Perplexity | Partial / uncertain | No official statement |
| Google AI Overviews | No official adoption yet | Google uses regular crawl + structured data |
| Bing / Microsoft Copilot | No official adoption yet | Uses Bing index + structured data |
Adoption is growing but uneven. The cost of shipping llms.txt is small (one file, low maintenance); the upside compounds as more AI tools adopt the standard. Treat it like sitemap.xml in 2005, when the Sitemaps protocol was new: early adoption was a hedge, and mass adoption later became the default.
Chapter 3. Required structure per the llmstxt.org spec
The spec is minimal but specific. Per llmstxt.org, a valid llms.txt has:
- An H1 with the project/brand name. Required, exactly one.
- A blockquote (lines starting with >) containing a short summary. The spec names the H1 as the only required section, but treat the summary as required in practice: it is the first context a consumer reads.
- Optional additional context as plain paragraphs after the blockquote.
- Zero or more H2 sections, each with a list of markdown links in the form - [Link Text](URL): optional description.
- An optional "Optional" section at the end (always literally named "Optional"), for secondary links the consumer can skip if context is tight.
Example file (the canonical pattern):
# Example Brand
> Example Brand is a B2C ecommerce store selling smart toothbrushes. Founded
> in 2019. Operates online + in 12 US cities.
This file is for AI crawlers and tools indexing Example Brand. Start with the
sections below for the most important content.
## Products
- [Pro Smart Toothbrush](https://www.example.com/products/pro): Flagship smart toothbrush, $129
- [Travel Smart Toothbrush](https://www.example.com/products/travel): Compact version for travel
- [Toothpaste Capsules](https://www.example.com/products/capsules): Refillable toothpaste capsules
## Documentation
- [Getting Started Guide](https://www.example.com/docs/getting-started): First-time setup
- [App Pairing Guide](https://www.example.com/docs/pairing): How to pair with the mobile app
- [Brushing Modes](https://www.example.com/docs/modes): All seven brushing modes explained
## Company
- [About](https://www.example.com/about): Company history and team
- [Press](https://www.example.com/press): Press coverage and assets
- [Contact](https://www.example.com/contact): Customer service contacts
## Optional
- [Blog](https://www.example.com/blog): Editorial posts on dental care
- [Reviews](https://www.example.com/reviews): Customer reviews and testimonials
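The structure above is regular enough to check mechanically. The following is a minimal sketch of such a check in Python, not an official validator: the function name validate_llms_txt and the exact rules are our own assumptions, and it only covers the points listed in this chapter (one H1 at the top, a blockquote summary, link items in the expected form under H2 sections).

```python
import re

# Matches "- [Link Text](URL)" with an optional ": description" suffix.
LINK_ITEM = re.compile(r"^- \[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$")


def validate_llms_txt(text: str) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    lines = [line.rstrip() for line in text.splitlines()]
    non_blank = [line for line in lines if line.strip()]

    # Exactly one H1, and it should be the first non-blank line.
    h1_lines = [line for line in lines if line.startswith("# ")]
    if len(h1_lines) != 1:
        problems.append(f"expected exactly one H1, found {len(h1_lines)}")
    if non_blank and not non_blank[0].startswith("# "):
        problems.append("first non-blank line should be the H1")

    # A blockquote summary (at least one line starting with ">").
    if not any(line.startswith(">") for line in lines):
        problems.append("no blockquote summary found")

    # Inside H2 sections, every list item should be a markdown link.
    in_section = False
    for number, line in enumerate(lines, start=1):
        if line.startswith("## "):
            in_section = True
        elif in_section and line.startswith("- ") and not LINK_ITEM.match(line):
            problems.append(
                f"line {number}: list item is not '- [text](url): description'"
            )
    return problems
```

Calling validate_llms_txt on the contents of the example file should return an empty list; a file that opens with an H2 and lists bare text items would be flagged on all three checks.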
Chapter 4. What should you list in llms.txt?
Curate, don't enumerate. llms.txt is meant to highlight your most important content, not catalogue every URL on the site. That's what sitemap.xml is for. Aim for 10-30 links across 3-6 sections.
Strong candidates for inclusion:
- Homepage and primary landing pages. The homepage is usually the single most important URL on a site.
- Top product or service pages. The pages you want AI engines to cite when asked about your offering.
- Pricing page. AI engines frequently get asked about pricing; having it clearly linked helps them answer accurately.
- Documentation / help center. Where AI engines can find authoritative how-to content for your product.
- About / Company. Entity disambiguation - "who is Example Brand" gets answered cleanly.
- Editorial pillar content. Your highest-authority blog posts or guides.
- Contact info. Customer-service contact details, which users sometimes ask AI engines for.
Weak candidates (don't bloat the file):
- Every individual blog post. Link to the blog index, not 500 individual posts.
- Every product variant. Link to the parent product, not every size/color.
- Internal admin / dashboard URLs. AI engines don't need these.
- Marketing micropages for specific campaigns. They date quickly.
Chapter 5. llms.txt vs llms-full.txt: which do you need?
Two file variants are in common use. Both are valid; most brands need only one.
| File | Contains | Best for |
|---|---|---|
| /llms.txt | Curated links with short descriptions | Almost every site (default choice) |
| /llms-full.txt | Full content of each linked page inline | Technical docs sites, API references, single-page-app docs |
For ecommerce stores, blog sites, and most content sites: ship just
/llms.txt. The basic file is small (a few KB) and gives AI engines
the navigation map they need.
For documentation sites, API references, or any site where the entire knowledge base
could fit in an LLM's context window: ship /llms-full.txt alongside the basic
file. Mintlify autogenerates both for the docs sites it hosts; Anthropic publishes both
for their documentation. The full file lets AI tools pull the entire docs corpus in one
request, useful for IDE assistants like Cursor.
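To make the "full content inline" idea concrete, here is a rough sketch of assembling an llms-full.txt from pages you already have as markdown. The layout (an H2 per page followed by a "Source:" line) is purely our assumption for illustration, and render_llms_full is a hypothetical helper; check what your docs platform actually generates before relying on any particular shape.

```python
def render_llms_full(name: str, pages: list[tuple[str, str, str]]) -> str:
    """Concatenate pages into one markdown document.

    pages holds (title, url, markdown_body) tuples in reading order.
    The per-page layout used here is an assumption, not a defined format.
    """
    parts = [f"# {name}", ""]
    for title, url, body in pages:
        # One H2 per page, the source URL, then the page body inline.
        parts += [f"## {title}", "", f"Source: {url}", "", body.strip(), ""]
    return "\n".join(parts).rstrip("\n") + "\n"
```

For a docs site the pages list would typically come from the same source that produces the curated link list, so the two files stay in sync.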
Chapter 6. Hosting and validation
Host at the literal path /llms.txt at the site root.
Not nested under /about/, not at a subdomain, not as a redirect from another
location. The spec is explicit: https://example.com/llms.txt.
- Serve as text/plain or text/markdown content type. Most static hosts default to one of these for .txt files.
- Make sure the file returns HTTP 200 and is not behind a login or blocked by robots.txt.
- Test with curl -I https://www.example.com/llms.txt to confirm response code and content type.
- Validate the file against the llmstxt.org spec - format expectations are simple but strict (H1 first, blockquote summary directly after it, section format consistent).
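The status and content-type checks from the list above can also be scripted, for example in a deploy pipeline. A minimal sketch using only Python's standard library; check_response and check_url are illustrative names of our own, and the accepted types are simply the two named in this chapter.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

ACCEPTED_TYPES = {"text/plain", "text/markdown"}


def check_response(status: int, content_type: str) -> list[str]:
    """Check a status code and Content-Type header value; return problems."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in ACCEPTED_TYPES:
        problems.append(f"unexpected content type: {media_type or '(none)'}")
    return problems


def check_url(url: str) -> list[str]:
    """Send a HEAD request (similar to curl -I) and check the response.

    Note: urlopen follows redirects, so a redirected file can still
    report 200 here; inspect redirects separately if that matters.
    """
    request = Request(url, method="HEAD")
    try:
        with urlopen(request, timeout=10) as response:
            return check_response(
                response.status, response.headers.get("Content-Type", "")
            )
    except HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions by urlopen.
        return check_response(error.code, error.headers.get("Content-Type", ""))
```

check_url("https://www.example.com/llms.txt") returns an empty list when both checks pass, and a list of readable problems otherwise.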
Common hosting paths per platform:
- Shopify: upload as a theme file, route via redirect, or use a third-party app.
- WordPress: upload to /public_html/ via FTP or a file-upload plugin.
- Webflow: hosting custom files requires a workaround - usually a redirect to a hosted URL elsewhere.
- Next.js / static sites: drop the file in /public/ and it serves at /llms.txt automatically.
- Custom server: add a route that serves the file with the correct content type.
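For the custom-server case, the route is small. Below is a sketch as a plain WSGI application using Python's standard library; make_app and serve are hypothetical names, the file location and port are assumptions, and a real deployment would more likely add the equivalent route to whatever framework the site already runs.

```python
from pathlib import Path
from wsgiref.simple_server import make_server


def make_app(llms_txt: bytes):
    """Build a WSGI app that serves the given bytes at /llms.txt."""

    def app(environ, start_response):
        if environ.get("PATH_INFO") == "/llms.txt":
            status, body = "200 OK", llms_txt
        else:
            status, body = "404 Not Found", b"Not found\n"
        # Set the content type explicitly rather than relying on defaults.
        start_response(status, [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return app


def serve(path: str = "llms.txt", port: int = 8000) -> None:
    """Serve a local file for manual testing (assumed path and port)."""
    app = make_app(Path(path).read_bytes())
    with make_server("", port, app) as server:
        server.serve_forever()
```

Running serve() locally and requesting /llms.txt with curl -I is a quick way to confirm the headers before deploying.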
Chapter 7. The breakages we see most often
Across the small set of sites we've audited running llms.txt (4 of 47 ecommerce sites plus ~20 other public files), the breakages we see most often:
- File served as text/html instead of plain text: parsers may reject it. Set content type explicitly.
- Missing or extra H1: spec requires exactly one H1 with the brand/project name.
- Blockquote missing: the spec expects a short summary inside > lines directly after the H1, and without it consumers get a link list with no context.
- Bloated link list: 100+ URLs across no clear sections. Defeats the purpose of curation.
- Outdated links: pages have moved, llms.txt still points at the old URLs. 404s and redirects waste the AI crawler's request.
- Hosted at a non-root path (/docs/llms.txt instead of /llms.txt): outside the spec.
- llms-full.txt shipped for a brochure site with no docs: produces a giant file no AI engine has reason to consume.
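Two of these breakages, bloated link lists and outdated links, are easy to catch with a periodic audit. A minimal sketch, assuming a 30-link ceiling taken from the 10-30 guideline in this guide; audit_links is an illustrative name, and get_status is a hypothetical callable you supply (for example a HEAD request that does not follow redirects), so the logic can be tested without network access.

```python
import re
from typing import Callable

# Captures the URL part of every markdown link "[text](url)".
LINK = re.compile(r"\[[^\]]+\]\(([^)\s]+)\)")
MAX_LINKS = 30  # assumption: upper end of the 10-30 range suggested here


def audit_links(text: str, get_status: Callable[[str], int]) -> list[str]:
    """Flag an oversized link list and any link that does not return 200."""
    problems = []
    urls = LINK.findall(text)
    if len(urls) > MAX_LINKS:
        problems.append(
            f"{len(urls)} links listed; consider trimming to {MAX_LINKS} or fewer"
        )
    for url in urls:
        status = get_status(url)
        if status != 200:
            # Redirects (3xx) and missing pages (4xx) both waste a request.
            problems.append(f"{url} returned {status}")
    return problems
```

Passing the status lookup in as a parameter keeps the audit independent of any particular HTTP client.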
We track llms.txt presence and validity through our Sentry product's AI-search rule set.
FAQ
Is llms.txt required?
No. It's a voluntary spec, not enforced by any standards body. Sites without it work fine; AI engines fall back to standard HTML crawling. llms.txt is an optimization, not a requirement.
Which AI engines actually read llms.txt?
Confirmed: Anthropic publishes their own; Mintlify autogenerates for customer docs sites; Cursor uses it for project context. ChatGPT, Perplexity, Google AI Overviews, and Bing have not publicly adopted it as of May 2026. Adoption is growing fast but uneven.
What's the difference between llms.txt and robots.txt?
robots.txt tells crawlers what they can and cannot access. llms.txt tells AI tools what you'd like them to prioritize. They serve different purposes: robots is access control, llms.txt is a curated navigation hint. Ship both.
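Because the two files interact, it is worth confirming that your robots.txt does not block /llms.txt itself. A small sketch using Python's standard urllib.robotparser module; llms_txt_allowed is our own helper name, and the example.com URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser


def llms_txt_allowed(robots_txt: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt content lets user_agent fetch /llms.txt."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, "https://www.example.com/llms.txt")
```

A robots.txt that only disallows /admin/ leaves the file reachable, while a blanket "Disallow: /" for all user agents would block it along with everything else.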
Should I list every page on my site in llms.txt?
No. llms.txt is meant to be curated - the 10-30 most important pages, not every URL. That's what sitemap.xml is for. AI engines lose signal when llms.txt is bloated; curation is the point.
Where exactly do I host it?
At the literal path /llms.txt at the site root - so
https://www.example.com/llms.txt. Not nested under another directory, not at
a subdomain, not as a redirect. Serve as text/plain or text/markdown
content type.
How often should I update llms.txt?
When the curated link list changes. For most ecommerce sites that's quarterly at most - new flagship products, retired SKUs, restructured navigation. For documentation sites with frequent content changes, monthly. Auto-generation from your CMS is the most maintainable pattern long-term.
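As an illustration of the auto-generation pattern, the sketch below renders llms.txt from structured data. The Link and Section classes and the render_llms_txt function are hypothetical names; in practice the data would come from your CMS or navigation config, and the summary is assumed to fit on one line.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    description: str = ""


@dataclass
class Section:
    heading: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(name: str, summary: str, sections: list[Section]) -> str:
    """Render an H1, a one-line blockquote summary, and H2 link sections."""
    parts = [f"# {name}", "", f"> {summary}", ""]
    for section in sections:
        parts.append(f"## {section.heading}")
        parts.append("")
        for link in section.links:
            item = f"- [{link.title}]({link.url})"
            if link.description:
                # The description is optional and follows a colon.
                item += f": {link.description}"
            parts.append(item)
        parts.append("")
    return "\n".join(parts).rstrip("\n") + "\n"
```

Regenerating the file on each deploy from the same data that drives site navigation keeps retired pages from lingering in the list.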
References
- llmstxt.org. "The /llms.txt file." llmstxt.org
- Anthropic. "Example llms.txt." docs.anthropic.com/llms.txt
- Mintlify. "llms.txt and llms-full.txt support." mintlify.com/docs/settings/llmstxt
- Cursor. "Using llms.txt for project context." docs.cursor.com
- Jeremy Howard. "Proposing llms.txt (original September 2024 announcement)." answer.ai/posts/2024-09-03-llmstxt.html