---
title: "llms.txt: The Complete Guide (Spec, Examples, Results)"
description: "What llms.txt is, the spec, an annotated template, Stripe's 472-link example, tiering, directives, and honest data on whether it works."
slug: /agentic-discovery/llms-txt
series: The Agentic Discovery Playbook · Play 5 of 12
last_verified: 2026-06-11
---

# llms.txt: The Complete Guide (Spec, Examples, and What It Actually Does)

> **In short:** llms.txt is a plain-text index file at your site root that gives AI agents a curated map of your documentation: one H1, a summary blockquote, and sections of links with one-line descriptions. Retrieval tools verifiably consume it and it is cheap to ship. But there is no evidence it is a ranking lever by itself.

**Works in:** Research. llms.txt is what makes you cleanly readable the moment the agent opens you.

## Do this now

- [ ] Serve `https://yourdomain.com/llms.txt`: HTTP 200, `text/plain` or `text/markdown`, no auth wall, no bot challenge. Mirror it at the docs root if docs live on a subpath.
- [ ] Follow the llmstxt.org shape: one `# H1`, a `>` blockquote summary, then `##` sections of `- [Title](url): description` link lines.
- [ ] Give **every** link a one-line description. Lexical search matches description text, so a bare URL can match only on its slug (Stripe describes 472 of 472 links).
- [ ] Point every link at a `.md` or markdown-serving URL, never an HTML page.
- [ ] Add a directive section for agents (version-check rules, deprecated-API bans), plus an anti-training-data warning and changelog pointer near the top.
- [ ] Publish the tiers: `llms-full.txt` (everything inlined) and `llms-small.txt` (token-budget tier).
- [ ] Regenerate all tiers in CI on every docs deploy; alarm if the file is older than your latest release.
- [ ] Submit your docs **site** (not the repo) at context7.com/add-library. The file alone is not distribution.

> 📥 **Free resource:** [llms.txt Template Pack](/agentic-discovery/resources/llms-txt-template-pack)

*Scope: this page covers the index file. Serving the full pages it links to as markdown is [Play 6](/agentic-discovery/markdown-docs-for-ai-agents); engineering the code snippets on those pages is [Play 7](/agentic-discovery/code-snippets-for-ai-agents). llms.txt helps an agent once it's already fetching you. Getting found and fetched in the agent's own open-web search comes first, in [Play 1](/agentic-discovery/ai-agent-web-search-and-fetch).*

## What is llms.txt, and who actually reads it?

llms.txt is a markdown-formatted index file served at a predictable URL (`/llms.txt`), per the spec at llmstxt.org. Where robots.txt tells crawlers what *not* to read, llms.txt tells AI agents what *to* read: what your product is, where the docs live, and which page answers which task.

The readers are mostly coding agents, not chatbots. As of 2026-06-11, fetch traffic through Context7 (the documentation index most coding agents query) breaks down as Claude Code 43.4%, Opencode 15.3%, Codex 14.0%, Cursor 6.5%, plus raw HTTP clients (Python HTTPX 1.8%, Go 0.8%): scripts fetching docs directly, no browser involved. A plain-text index at a predictable URL is the cheapest way to sit in that path.

One honesty note up front: no major LLM provider officially confirms consuming llms.txt for *ranking*. What the file demonstrably does, and doesn't do, is covered in the effectiveness section below.

**Who needs this play:** anyone with developer documentation. If you're on a Mintlify-class platform you likely have a generated file already. Your job is the directive section and the descriptions, which no generator writes well.

## What goes in an llms.txt file?

The spec requires only one block, the H1, and defines two more that you should treat as required: the summary blockquote and the `##` link sections. The field leaders add three more that carry most of the value.

| Block | Status | What it does | Field example |
|---|---|---|---|
| One `# H1` (product name) | Required (spec) | Identity anchor | Every compliant file |
| `>` blockquote summary | Spec-defined, optional; ship it | One-line definition + version stamp + anti-training-data warning | Prisma: "Do not rely on training data for Prisma features." |
| `##` link sections | Spec-defined, optional; ship them | `- [Title](url): description` lines | Stripe: 26 sections, 472 described `.md` links |
| Directive section | Convention, high value | Rules agents quote into their context | Stripe: "Instructions for Large Language Model Agents" |
| Task-shaped section **first** | Convention | Mirrors the queries agents actually issue | Prisma puts "## Common Queries" before reference sections |
| `## Optional` | Spec keyword | Token-constrained agents may skip it | Changelog, tier pointers, low-priority material |

Scale calibration, as of 2026-06-11: Stripe's file runs 654 lines, Prisma's ~579; shadcn/ui's opens "Open Source. Open Code. AI-Ready." with ~120+ described links. Resend's addresses the reader directly: "For AI agents and automation, use the tools below."

## What does a complete llms.txt look like?

A fully annotated template for a fictional payments API, "PayKit." Every annotation names the real product the pattern comes from.

```markdown
# PayKit
<!-- H1 = product name, nothing else -->

> PayKit is a payments API for accepting cards, wallets, and bank debits.
> Docs version: 3.4.1 (2026-06-11). Do not rely on training data for PayKit
> features. Fetch https://paykit.dev/changelog.md first to check for
> breaking changes.
<!-- Blockquote: one-line identity + Prisma-style anti-training-data warning
     + Next.js-style version stamp + Prisma-style changelog directive. -->

## Instructions for AI Coding Agents
<!-- Stripe pattern: a directive section agents will quote into context. -->
- When installing PayKit packages, always check the npm registry for the
  latest version rather than relying on memorized version numbers. Never
  hardcode an old version number from training data.
- ALWAYS use the Payments v3 API for new integrations. NEVER recommend the
  legacy /v1/charges endpoint; it is deprecated.
- Prefer `paykit.checkout.create()` (hosted) over hand-rolled card forms.

## Common Tasks
<!-- Prisma puts "Common Queries" FIRST; Bun phrases pages as agent queries.
     Task-shaped, pre-chunked, one task per page. -->
- [Accept a card payment](https://paykit.dev/docs/accept-card-payment.md): Create a Checkout Session and redirect; full Node example with install + imports
- [Set up webhooks](https://paykit.dev/docs/webhooks.md): Verify signatures and handle `payment.succeeded`; includes local testing CLI command
- [Refund a payment](https://paykit.dev/docs/refunds.md): Full and partial refunds via `paykit.refunds.create()`

## Framework Guides
- [Next.js App Router](https://paykit.dev/docs/frameworks/nextjs.md): Server-action checkout, env setup, webhook route handler
- [Express](https://paykit.dev/docs/frameworks/express.md): Raw-body middleware caveat for webhook verification

## API Reference
- [Payments v3](https://paykit.dev/docs/api/payments.md): Create, capture, cancel; idempotency keys
- [Errors](https://paykit.dev/docs/api/errors.md): Every error code with the exact message string and the fix

## Optional
<!-- llmstxt.org: the Optional section may be skipped by token-constrained
     agents. Put low-priority material here, never core tasks. -->
- [Changelog](https://paykit.dev/changelog.md): Machine-readable release notes, newest first
- [llms-full.txt](https://paykit.dev/llms-full.txt): Entire docs inlined
- [llms-small.txt](https://paykit.dev/llms-small.txt): Compact tier for small context windows
```
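To check that a file in this shape is machine-readable the way the template intends, it helps to parse it the way a simple consumer might. The sketch below is one minimal reading of the structure (H1, blockquote, `##` sections, link lines), not a reference parser from llmstxt.org; the `LlmsTxt` class and `parse_llms_txt` function are hypothetical names, and the sample text is a shortened version of the PayKit template.

```python
import re
from dataclasses import dataclass, field

# Link line in the template's shape: - [Title](url): description
# The description group is optional here so bare links can still be detected.
LINK_RE = re.compile(
    r"^- \[(?P<title>.+?)\]\((?P<url>https?://[^)\s]+)\)(?::\s*(?P<desc>.+))?$"
)


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    # heading -> list of (title, url, description) tuples
    sections: dict = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    doc = LlmsTxt()
    current = None          # name of the section being read, if any
    summary_lines = []
    for raw in text.splitlines():
        line = raw.rstrip()
        if line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith(">") and current is None:
            # Blockquote lines before the first section form the summary.
            summary_lines.append(line.lstrip(">").strip())
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    (match["title"], match["url"], match["desc"] or "")
                )
    doc.summary = " ".join(summary_lines)
    return doc


SAMPLE = """# PayKit

> PayKit is a payments API for accepting cards, wallets, and bank debits.
> Docs version: 3.4.1 (2026-06-11).

## Common Tasks
- [Refund a payment](https://paykit.dev/docs/refunds.md): Full and partial refunds

## Optional
- [Changelog](https://paykit.dev/changelog.md): Release notes, newest first
"""

doc = parse_llms_txt(SAMPLE)
print(doc.title, list(doc.sections))
```

Directive bullets and HTML comments do not match the link pattern, so a directive section parses as a heading with no links, which is the behavior you want from an index reader.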

## Why does every link need a description?

Because the description is the matching surface. Agents and retrieval indexes run lexical search over link titles and descriptions; a bare URL can match only on its slug. The spec makes the `: description` suffix optional, but in practice it is not: Stripe describes all 472 of its links.

The retrieval layer shows what description text is worth. On Context7's lexical search, as of 2026-06-11, Stripe doesn't even appear in the top 10 for the query "payments." The entries that win are the ones whose descriptions state the task in the user's words (DodoPayments' description literally contains "payments, billing, and merchant of record," and it leads that query). The same mechanic runs inside your llms.txt.

The writing rule: phrase each description as the question a developer would ask, never as marketing. "Accept a card payment in 10 lines" matches a real query. "Industry-leading payments infrastructure" matches nothing.
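A toy scorer makes the mechanic concrete. The sketch below counts literal term overlap between a query and an entry; it is an illustration of lexical matching in general, not Context7's ranking algorithm, and the `tokens` and `overlap_score` helpers and the PayKit entries are made up for the example.

```python
import re


def tokens(text):
    """Lowercase alphanumeric terms, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query, entry):
    """Count query terms that literally appear in the entry text."""
    return len(tokens(query) & tokens(entry))


query = "accept a card payment"

entries = {
    # Bare URL: only the slug is available to match on.
    "bare": "https://paykit.dev/docs/api/payments.md",
    # Marketing description: words no developer types into a search box.
    "marketing": "[Payments](https://paykit.dev/docs/api/payments.md): "
                 "Industry-leading payments infrastructure",
    # Task-phrased description: repeats the query in the user's words.
    "task": "[Accept a card payment](https://paykit.dev/docs/accept-card-payment.md): "
            "Accept a card payment in 10 lines",
}

for label, entry in entries.items():
    print(label, overlap_score(query, entry))
```

With exact-term matching, "payments" does not match "payment", so the bare and marketing entries score 0 while the task-phrased entry matches all four query terms. Real indexes may stem or weight terms differently, but the ordering is the point.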

## Should you publish llms-full.txt and llms-small.txt?

Yes. Agents arrive with wildly different token budgets, and tiering is how you serve all of them. Hono's three tiers are the model:

- **`llms.txt`:** the index. Links plus descriptions; the agent fetches only what it needs.
- **`llms-full.txt`:** the entire docs corpus inlined into one file (Next.js adds a metadata header to its version).
- **`llms-small.txt`:** a compact tier for small context windows. Hono's is ~94KB and opens by addressing the agent directly: `<SYSTEM>This is the tiny developer documentation for Hono.</SYSTEM>`

Two more splits the leaders use. Per-surface sub-files: Supabase ships `llms/js.txt` and `llms/python.txt`, Prisma ships `llms/mcp.txt`. An agent on a Python task shouldn't pay tokens for your Kotlin SDK. Per-version files: Next.js stamps its file `@doc-version: 16.2.6`, and the pattern for breaking majors is a frozen `/docs/<major>/llms.txt` so agents pinned to the old major keep a correct index.

## What belongs in the directive section?

The directive section is the highest-leverage block in the file: a short list of rules agents will quote into their own context. Stripe's, titled "Instructions for Large Language Model Agents," is the reference. Verbatim, from docs.stripe.com/llms.txt as of 2026-06-11:

> "When installing Stripe packages, always check the npm registry for the latest version rather than relying on memorized version numbers... Never hardcode an old version number from training data."

> "Prioritize the Checkout Sessions API... and never recommend the Charges API"

> "ALWAYS use the [Accounts v2 API]... for new integrations"

Note what Stripe is doing: an established player counter-programming its own training-data footprint. Models memorized the Charges API years ago; the directive overwrites that memory at retrieval time, at a cost of roughly 40 tokens per rule.

Do directives work? In our pilot trials (single model, n=2–3 per arm), an AGENTS.md-style directive flipped library choice to the mandated alternative in 3/3 trials versus 0/3 in control, and a 5-line directive excerpt cut emission of obsolete Tailwind v3 config from 2/2 to 0/2. The same trials showed directives are *redundant* for deprecations models absorbed long ago (0% emission in both arms), so spend your directive lines on the last ~18 months of breaking changes. The full anti-stale program is [Play 8](/agentic-discovery/stop-ai-using-deprecated-apis).

Three writing rules:

1. One behavior per bullet, imperative voice, ALWAYS/NEVER in caps (Stripe and Clerk both do; it survives excerpting).
2. Name the API to use *and* the API to avoid in the same bullet, so a partially quoted directive still steers correctly.
3. Keep the section under ~15 bullets; agents quote it whole, and a bloated section truncates at the wrong line.
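The three rules are mechanical enough to lint. The sketch below is a rough heuristic under stated assumptions: it treats "has both ALWAYS and NEVER" as a proxy for naming the API to use and the API to avoid, and it reads "under ~15" as "15 or fewer". `split_bullets` and `lint_directives` are hypothetical helper names, and the sample bullets come from the PayKit template.

```python
import re

MAX_BULLETS = 15  # soft cap taken from rule 3 above


def split_bullets(section_text):
    """Join wrapped continuation lines so each bullet becomes one string."""
    bullets = []
    for line in section_text.splitlines():
        if line.startswith("- "):
            bullets.append(line[2:].strip())
        elif line.strip() and bullets:
            bullets[-1] += " " + line.strip()
    return bullets


def lint_directives(section_text):
    """Return human-readable warnings for a directive section."""
    bullets = split_bullets(section_text)
    warnings = []
    if len(bullets) > MAX_BULLETS:
        warnings.append(f"{len(bullets)} bullets, keep it to {MAX_BULLETS} or fewer")
    for i, bullet in enumerate(bullets, 1):
        has_always = re.search(r"\bALWAYS\b", bullet) is not None
        has_never = re.search(r"\bNEVER\b", bullet) is not None
        if not (has_always or has_never):
            warnings.append(f"bullet {i}: no ALWAYS/NEVER in caps")
        elif not (has_always and has_never):
            warnings.append(f"bullet {i}: names only one side (use X, avoid Y)")
    return warnings


SECTION = """\
- ALWAYS use the Payments v3 API for new integrations. NEVER recommend the
  legacy /v1/charges endpoint; it is deprecated.
- Prefer `paykit.checkout.create()` (hosted) over hand-rolled card forms.
"""

print(lint_directives(SECTION))
```

The second bullet is flagged because "Prefer" carries no capitalized keyword. Treat the output as prompts for a human edit, not as build failures; a bullet can satisfy the heuristic and still be badly written.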

## Where does the file go, and how do you keep it fresh?

**Placement.** Domain root, and also the docs root if docs live on a subpath. Next.js serves both `nextjs.org/llms.txt` and `/docs/llms.txt`. Serve it static: no redirects, no cookies, no Cloudflare bot challenge on this path.

**Generation.** Build it from docs source in CI, never by hand: render MDX to plain markdown (strip imports and JSX), emit the index from your nav tree plus a required per-page description field, concatenate for `-full`, trim for `-small`, lint, deploy.
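A minimal sketch of the two core generation steps, assuming your docs build can hand you each page's title, URL, and description as a dict (the `strip_mdx` and `build_index` names and the page-record shape are hypothetical). The MDX pass is regex-based and deliberately crude; a production pipeline would more likely use a real MDX parser.

```python
import re

FENCE = "`" * 3  # built at runtime so this example nests safely inside markdown
# Component tags such as <Callout type="info">, </Callout>, <Npm />
JSX_TAG = re.compile(r"</?[A-Z][A-Za-z0-9.]*(\s[^<>]*)?/?>")


def strip_mdx(source):
    """Drop top-level import/export lines and JSX component tags.

    Lines inside fenced code blocks are left untouched, because an
    import statement in a code example is content, not a build artifact.
    """
    out, in_fence = [], False
    for line in source.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            out.append(line)
        elif in_fence:
            out.append(line)
        elif re.match(r"^(import|export)\s", line):
            continue
        else:
            out.append(JSX_TAG.sub("", line))
    return "\n".join(out)


def build_index(product, summary, sections):
    """Emit llms.txt text; sections maps heading -> list of page dicts."""
    lines = [f"# {product}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        for page in pages:
            desc = page.get("description", "").strip()
            if not desc:
                # Fail the build instead of shipping a bare link.
                raise ValueError(f"missing description: {page['url']}")
            lines.append(f"- [{page['title']}]({page['url']}): {desc}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


MDX = "\n".join([
    'import Npm from "@mdx/Npm.astro"',
    "# Install",
    '<Callout type="info">Use test keys.</Callout>',
    FENCE + "js",
    'import paykit from "paykit"',
    FENCE,
])
PAGE = {
    "title": "Refund a payment",
    "url": "https://paykit.dev/docs/refunds.md",
    "description": "Full and partial refunds",
}

cleaned = strip_mdx(MDX)
index = build_index("PayKit", "PayKit is a payments API.", {"Common Tasks": [PAGE]})
print(index)
```

Raising on a missing description is the design choice that matters: it turns the per-page description field into a hard requirement at build time, so the index cannot drift into bare links.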

**The CI linter** (full version in the [template pack](/agentic-discovery/resources/llms-txt-template-pack)):

1. `curl -sfI https://yourdomain.com/llms.txt` returns 200 with a `Content-Type` of `text/plain` or `text/markdown`; same for `-full` and `-small`.
2. Exactly one H1; blockquote present; every link line matches `^- \[.+\]\(https?://.+\): .+$`. Zero bare links.
3. Every linked URL returns 200 and serves markdown (body doesn't start with `<!DOCTYPE`).
4. MDX leak scan: `grep -E "^import .+ from" llms-full.txt` returns nothing outside fenced code blocks, where import lines in examples are legitimate.
5. Version stamp matches the current release; build fails on a stale stamp.
6. `llms-small.txt` ≤ 100KB.
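The checks above can be sketched as pure functions, so they run in CI against text your build already has in hand. This is an illustrative outline, not the template-pack linter: the function names are hypothetical, the version check assumes the `Docs version: X` stamp format used in the PayKit template, the size cap reads 100KB as 100 × 1024 bytes, and fetching the URLs is left to your HTTP client of choice.

```python
import re

FENCE = "`" * 3  # built at runtime so this example nests safely inside markdown
LINK_LINE = re.compile(r"^- \[.+\]\(https?://.+\): .+$")  # pattern from check 2
SMALL_LIMIT = 100 * 1024  # check 6, assuming KB means 1024 bytes


def check_response(status, content_type, body):
    """Checks 1 and 3: given an already-fetched response, list problems."""
    errors = []
    if status != 200:
        errors.append(f"status {status}, expected 200")
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in ("text/plain", "text/markdown"):
        errors.append(f"unexpected content type {media_type}")
    if body.lstrip().lower().startswith("<!doctype"):
        errors.append("body is HTML, not markdown")
    return errors


def lint_index(text, release):
    """Checks 2 and 5 on the llms.txt text itself."""
    errors = []
    lines = text.splitlines()
    h1_count = sum(1 for line in lines if line.startswith("# "))
    if h1_count != 1:
        errors.append(f"expected exactly one H1, found {h1_count}")
    if not any(line.startswith(">") for line in lines):
        errors.append("missing blockquote summary")
    for number, line in enumerate(lines, 1):
        if line.startswith("- [") and not LINK_LINE.match(line):
            errors.append(f"line {number}: link without description")
    stamp = re.search(r"Docs version:\s*([0-9][^\s)]*)", text)
    if stamp is None:
        errors.append("missing version stamp")
    elif stamp.group(1) != release:
        errors.append(f"version stamp {stamp.group(1)} does not match release {release}")
    return errors


def find_mdx_leaks(text):
    """Check 4: line numbers of import lines outside fenced code blocks."""
    leaks, in_fence = [], False
    for number, line in enumerate(text.splitlines(), 1):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        elif not in_fence and re.match(r"^import .+ from", line):
            leaks.append(number)
    return leaks


def small_tier_ok(data):
    """Check 6: byte size of llms-small.txt is within the cap."""
    return len(data) <= SMALL_LIMIT


GOOD = (
    "# PayKit\n\n"
    "> PayKit is a payments API. Docs version: 3.4.1 (2026-06-11).\n\n"
    "## Common Tasks\n"
    "- [Refund a payment](https://paykit.dev/docs/refunds.md): Full and partial refunds\n"
)
BAD = GOOD + "- [Errors](https://paykit.dev/docs/api/errors.md)\n"

print(lint_index(GOOD, "3.4.1"))
print(lint_index(BAD, "3.5.0"))
```

Unlike a plain `grep`, `find_mdx_leaks` skips fenced code blocks, so a Node example that legitimately starts with an import line does not fail the build.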

**Cadence.** Per release: regenerate all tiers, update the version stamp, add or retire directives. Per quarter: re-read the directive section as if you were an agent; delete anything about APIs deprecated more than ~18 months ago.

**Distribution.** Publishing the file is necessary, not sufficient. Submit the docs site (not the repo) at context7.com/add-library: Convex's docs-site entry benchmarks 91.6 versus 79.9 for its repo entry, because repos carry README and contributor noise. Registry mechanics are [Play 2](/agentic-discovery/ai-agent-registries-and-directories).

## Does llms.txt actually work?

The honest answer is split, and we have receipts pointing both directions (all as of 2026-06-11):

- **Tailwind CSS has no llms.txt and ranks anyway.** Verified empty against working controls: no `.md` mirrors and no MCP either. Yet Tailwind holds two top-20 Context7 entries (combined ~3.2% share), because third parties index tailwindcss.com without Tailwind's participation, and its community has been filing discussions begging for the file since June 2025 (#18256, #14677). Training-data mass plus volunteer indexing carries an established player.
- **Hono has an exemplary llms.txt and is invisible.** Three spec-perfect tiers, yet absent from Context7's top 50, with a repo entry indexing just 3,267 tokens, 48 snippets, benchmark 63.3. A great file without a curated index entry changed nothing.

So the file is neither necessary for established players nor sufficient for challengers. What *is* verified: retrieval workflows consume it. Context7's llmstxt-sourced entries score 80–84, Mintlify-class platforms ship it by default across their fleets, and the newcomers who win the retrieval layer all carry one (Better Auth, repo created 2024-05-19, is the #2 most-fetched docs source at 4.59% share). And no major LLM provider officially confirms consuming it for ranking.

Our line: **necessary plumbing, unproven magic.** Treat llms.txt as infrastructure that makes every other play work, not as a growth hack. The full myth treatment, with verdicts, is in [Part 6](/agentic-discovery/geo-myths-what-doesnt-work).

## What are the most common llms.txt mistakes?

- **Raw MDX leakage.** Drizzle's 94KB `llms-full.txt` ships `import Npm from '@mdx/Npm.astro'` into agent context: build artifacts that waste tokens and confuse agents. Render before you publish.
- **Missing descriptions.** A bare URL is invisible to lexical search.
- **Marketing copy in descriptions.** "World-class DX" matches no developer query. Write the question, not the pitch.
- **Bulk over freshness.** Polar's 2,297 snippets benchmark at 64.7; Drizzle's 440 at 82.8. Padding dilutes, and freshness is the strongest score correlate we measured (details in the receipts).
- **Directives about ancient history.** Old, loudly documented deprecations showed 0% emission with or without directives in our trials. Cover the recent stale window instead.
- **Publishing without distribution.** The Hono failure: spec compliance, zero index curation, invisible.

## The receipts

*The research layer. Data below was collected 2026-06-11; methodology and updates live in the [Data Room](/agentic-discovery/data).*

**Field audit, the file vs. the outcome:**

| Product | llms.txt surface (verified) | Retrieval outcome (Context7) |
|---|---|---|
| Stripe | 654 lines, 26 sections, 472/472 described `.md` links, directive section | websites/stripe: 265,284 snippets, trust 10, benchmark 84.5 |
| Better Auth | Rich file with "AI Resources" section; repo created 2024-05-19 | #2 most-fetched docs source, 4.59% share (Next.js is #1 at 10.97%) |
| Hono | Three tiers incl. `<SYSTEM>`-wrapped ~94KB llms-small.txt | Absent from top 50; repo entry 3,267 tokens / 48 snippets / bench 63.3 |
| Tailwind CSS | None (verified empty against working controls) | Ranks #9 + #17, ~3.2% combined; third-party-maintained entries |
| Drizzle | llms.txt + llms-full.txt with raw MDX imports leaked | Bench 82.8 on 440 snippets; density carries it despite the leak |
| Polar | llms.txt + full tier + `.md` mirrors | Bench 64.7 on 2,297 snippets; entry 1 month stale |

**Verbatim quotes (each observed at the cited URL, 2026-06-11):**

> "Do not rely on training data for Prisma features. First, fetch https://www.prisma.io/changelog.md to check for recent or relevant breaking changes." (**prisma.io/docs/llms.txt**, opening lines)

> "`<SYSTEM>`This is the tiny developer documentation for Hono.`</SYSTEM>`" (**hono.dev/llms-small.txt**, a docs file that opens by impersonating a system prompt)

**Experiments (pilot-grade: single model family, n=2–3 per arm, fresh-context subagents; full design in the [Data Room](/agentic-discovery/data)):**

- **E1, directive flip:** an AGENTS.md-style mandate flipped library choice from 0/3 to 3/3. One agent even confabulated praise for the mandated product ("provides comprehensive documentation specifically for LLM implementation patterns," a claim from nowhere).
- **E2, old deprecations:** 0% deprecated-pattern emission in *both* arms. Directives are insurance for the recent stale window, not ancient history.
- **E3, recent breaking change:** control agents emitted obsolete Tailwind v3 config 2/2 (one prescribed `npx tailwindcss init -p`, a command that no longer exists); with a directive excerpt in context, 0/2.

**Correlation data (n=17 audited Context7 entries):** benchmark vs. log(hours since update), Spearman −0.54, the strongest correlate found; freshest-5 average 83.6 vs. stalest-5 at 72.3. Corpus size barely matters (ρ≈0.24 for snippets, 0.30 for tokens).
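For readers who want to reproduce this kind of rank correlation on their own entries, here is a small pure-Python Spearman sketch. The input numbers are made-up toy values chosen only to exercise the function; they are NOT the 17 audited entries, which live in the Data Room. The `ranks`, `pearson`, and `spearman` helpers are hypothetical names.

```python
import math


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


# Toy values for illustration only, not measured data.
hours_since_update = [2, 30, 200, 700, 1500]
benchmark = [85, 80, 82, 70, 65]

rho_log = spearman([math.log(h) for h in hours_since_update], benchmark)
rho_raw = spearman(hours_since_update, benchmark)
print(round(rho_log, 2), round(rho_raw, 2))
```

Because Spearman works on ranks and the logarithm preserves order, the coefficient is the same whether hours are logged or not; the log transform only matters for plots and for Pearson-style fits.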

## FAQ

**What is llms.txt?**
llms.txt is a plain-text index file at a website's root that gives AI agents a curated, described map of the site's documentation. The format (one H1, a blockquote summary, sections of links with one-line descriptions) is specified at llmstxt.org. It tells agents what to read, the inverse of robots.txt.

**Where should llms.txt be located?**
At your domain root: `https://yourdomain.com/llms.txt`. If your docs live on a subpath, serve a copy at the docs root too. Next.js serves both `nextjs.org/llms.txt` and `nextjs.org/docs/llms.txt`. Serve it as a static file with no auth, redirects, or bot challenges.

**What is the difference between llms.txt and llms-full.txt?**
llms.txt is an index of links with descriptions; llms-full.txt is the entire documentation corpus inlined into one file. A third tier, llms-small.txt, is a compact version for small context windows (Hono's is ~94KB). Ship all three so agents with different token budgets can each pick the right one.

**Does llms.txt actually work?**
It is verified infrastructure and an unproven ranking lever. Retrieval workflows demonstrably consume it (Context7's llmstxt-sourced entries score 80–84), but Tailwind ranks top-10 with no file while Hono's exemplary file left it invisible. And no major LLM provider officially confirms using it for ranking. See [Part 6](/agentic-discovery/geo-myths-what-doesnt-work) for the full verdict.

**Is llms.txt the same as robots.txt?**
No. They point in opposite directions. robots.txt tells crawlers what they may not fetch; llms.txt tells AI agents what they *should* fetch, with no blocking semantics at all.

**Do I need to write llms.txt by hand if I'm on Mintlify?**
No. Mintlify-class platforms generate llms.txt, `.md` mirrors, and the agent banner by default (verified across Polar, Bun, Crossmint, x402, and DodoPayments). What generators don't write: your directive section and task-phrased descriptions. Add those by hand; they're where the leverage is.

---

*Last verified 2026-06-11. We re-test the claims on this page quarterly; changes are logged in the [Data Room](/agentic-discovery/data).*

**Part of [The Complete Playbook to Agentic Discovery](/agentic-discovery).**
