---
title: How Stripe Fights Its Own Training-Data Ghost
description: Stripe case study — a 265,284-snippet agent index, a 654-line llms.txt that bans its own Charges API, and a +32% 30-day retrieval trend.
slug: /agentic-discovery/case-studies/stripe
series: The Agentic Discovery Playbook — Case Study
last_verified: 2026-06-11
---

# How Stripe Fights Its Own Training-Data Ghost

> **The lesson:** An incumbent's agentic-discovery problem isn't absence — it's being confidently wrong in the model's memory. Stripe's answer is to route agents onto surfaces it controls (a 265,284-snippet docs index, a remote MCP) and use those surfaces to issue direct orders: never recommend the Charges API, never trust memorized version numbers.

## At a glance

| Category | Number (as of 2026-06-11) |
|---|---|
| Context7 docs-site entry (/websites/stripe) | 265,284 snippets · trust 10 · benchmark 84.5 — the largest corpus in our audit |
| Context7 rank / momentum | #38 of top 50 (0.80% share) · **+32% over 30 days** |
| Own-repo entry (stripe/stripe-node) | 23,259 tokens · 130–207 snippets · benchmark 79.6–81 |
| docs.stripe.com/llms.txt | 654 lines · 472 described `.md` links · 26 sections |
| Remote MCP (mcp.stripe.com) | ~25 tools, incl. `search_stripe_documentation` |

## What they built

Every model that writes payments code has years of Stripe in its training data — and a lot of it is wrong now. Models memorized the Charges API, old SDK version numbers, the pre-v2 Accounts API. Stripe's agent surface is engineered to overwrite that ghost, not to introduce the product.

**Layer 1 — the index that routes.** The docs llms.txt is 654 lines: 472 links to `.md` pages, each with a one-line description, organized into 26 sections. This is the file an agent fetches before deciding which page to read — and Stripe wrote a description for every door.
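
To make the routing role concrete, here is a minimal sketch of how an agent (or an auditor) might summarize an llms.txt-style index before choosing a page. It assumes the `- [Title](url): description` entry shape from the llms.txt proposal and `## ` section headings; we have not confirmed that Stripe's file follows this shape line for line. `summarize_index`, `LINK_RE`, and the sample text are hypothetical names and data, not Stripe's.

```python
import re

# Assumed entry shape: "- [Title](https://example.com/page.md): one-line description"
LINK_RE = re.compile(
    r"^\s*[-*]\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.+))?$"
)


def summarize_index(text: str) -> dict:
    """Count lines, sections, .md links, and described .md links in an index."""
    lines = text.splitlines()
    sections = md_links = described = 0
    for raw in lines:
        line = raw.rstrip()
        if line.startswith("## "):
            sections += 1
            continue
        match = LINK_RE.match(line)
        if match and match.group("url").endswith(".md"):
            md_links += 1
            if match.group("desc"):
                described += 1
    return {
        "lines": len(lines),
        "sections": sections,
        "md_links": md_links,
        "described": described,
    }


# Made-up sample index (hypothetical product and URLs).
SAMPLE = """\
# Example Docs

## Payments
- [Accept a payment](https://docs.example.com/payments/accept.md): Build a checkout flow.
- [Refunds](https://docs.example.com/payments/refunds.md)

## Accounts
- [Create an account](https://docs.example.com/accounts/create.md): Onboard a connected account.
"""

print(summarize_index(SAMPLE))
```

A gap between `md_links` and `described` is the thing to watch: an undescribed link gives the agent nothing to match on when it picks a page.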

**Layer 2 — the instructions.** The same file contains a section literally titled "Instructions for Large Language Model Agents." It is not a sitemap; it's a prompt:

> "When installing Stripe packages, always check the npm registry for the latest version rather than relying on memorized version numbers... Never hardcode an old version number from training data."
> — docs.stripe.com/llms.txt, observed 2026-06-11

> "Prioritize the Checkout Sessions API... and never recommend the Charges API"
> — docs.stripe.com/llms.txt — an incumbent counter-programming its own legacy API

> "ALWAYS use the Accounts v2 API"
> — docs.stripe.com/llms.txt
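
The version directive translates fairly directly into code. The sketch below is one possible reading, not Stripe's own tooling: it asks the public npm registry for whatever a package currently tags as `latest` instead of trusting a remembered number. The `https://registry.npmjs.org/<package>/latest` endpoint is the registry's real one; `latest_version`, `install_command`, and the injectable `fetch` parameter are illustrative names, and scoped packages (`@scope/name`) are deliberately not handled.

```python
import json
import urllib.request
from typing import Callable

REGISTRY = "https://registry.npmjs.org"


def fetch_json(url: str) -> dict:
    """Default fetcher: GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def latest_version(package: str, fetch: Callable[[str], dict] = fetch_json) -> str:
    """Return the version the registry currently tags as `latest`."""
    return fetch(f"{REGISTRY}/{package}/latest")["version"]


def install_command(package: str, fetch: Callable[[str], dict] = fetch_json) -> str:
    """Build an install command pinned to the live version, not a memorized one."""
    return f"npm install {package}@{latest_version(package, fetch)}"
```

Passing `fetch` in as a parameter keeps the lookup testable offline; in normal use `install_command("stripe")` performs one network request at the moment the agent needs the version.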

**Layer 3 — the tools and the workflow docs.** A remote MCP server at mcp.stripe.com exposes roughly 25 tools, including `search_stripe_documentation` — so an agent that connects never needs to rely on its prior at all. Dedicated agent-workflow pages (`/agents.md`, `/building-with-ai.md`) round out the surface.

Read together, the three layers are one machine with one job:

1. The llms.txt index makes Stripe's *current* docs the cheapest thing for an agent to fetch.
2. The directive section makes the fetch overwrite the memorized, deprecated version of Stripe.
3. The MCP makes both unnecessary — the agent queries live docs instead of remembering anything.

Each layer is insurance against the failure mode of the one above it. An agent that skips the index may still hit a directive on a `.md` page; an agent that reads nothing can still be handed `search_stripe_documentation`.

One deliberate-looking asymmetry: Stripe's own-repo Context7 entries are small (stripe-node parsed to 130–207 snippets), while the docs-site entry is the largest corpus we found anywhere — 265,284 snippets at trust 10, benchmark 84.5. The center of gravity is the curated docs site, not the SDK repos. That matches our broader finding that docs-site entries outscore repo entries (Convex's site entry beats its own repo by 11.7 benchmark points).

## The receipts

**The numbers (Context7, collected 2026-06-11):**

| Entry | Tokens / snippets | Trust | Benchmark | Note |
|---|---|---|---|---|
| /websites/stripe | 265,284 snippets | 10 | 84.5 | Largest docs corpus in our entire audit; rank #38; +32%/30d |
| stripe/stripe-node | 23,259 tokens · 130–207 snippets | 8.9 | 79.6–81 | Updated 1 week prior |

The stripe-node snippet count read **130 on the library page and 207 in the search API on the same day**, a gap of 77 snippets, or about 59% of the lower reading. We use this discrepancy as the canonical snapshot-noise example across this playbook: these metrics are recomputed continuously, so any single-day value carries error bars of roughly ±10%, and raw snippet counts can evidently swing much further.

**The lexical gap (as of 2026-06-11):** Stripe does not appear in Context7's lexical top 10 for the query "payments" — that query is won by DodoPayments on description text. Stripe's retrieval presence lives under its own name, plus the first-party routing above. An incumbent can afford that; see the caveats below.

**Do the directives actually work?** Our pilot experiments give the honest, two-part answer (both run 2026-06-11, Claude Haiku 4.5 subagents, tools disabled, n=1–2 per arm — pilot-grade):

| Experiment | Target | Control (no directive) | Treatment (directive in context) |
|---|---|---|---|
| E2 — old deprecation | Stripe one-time charge (Charges vs PaymentIntents), n=1+1 | 0% deprecated — `paymentIntents.create()` chosen unprompted | 0% — no effect needed |
| E3 — recent change | Tailwind v4 setup (the stale-window analog), n=2+2 | 100% obsolete output | 0% |

The Charges deprecation is old enough that the model we tested had already absorbed it, so today Stripe's anti-Charges line is about 40 tokens of insurance. E3, though, suggests the mechanism works exactly where it matters: the stale window between an API change and the next training cycle. At n=2 per arm that is a pilot signal, not a proof. Stripe ships breaking changes continuously; the directive section is standing infrastructure for every future one. Full nuance in [Play 8](/agentic-discovery/stop-ai-using-deprecated-apis).
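
For readers who want to rerun this kind of check, the scoring step can be sketched as below. This is an assumed reconstruction, not the harness we used: it treats an output as deprecated when it calls `charges.create(`, and reports the share of deprecated outputs per arm. `deprecated_rate` and the two sample lists are hypothetical; the strings are made-up stand-ins for model output, not transcripts from E2.

```python
import re

# Assumed marker of the legacy pattern: any call to `charges.create(`.
DEPRECATED_CALL = re.compile(r"\bcharges\.create\s*\(")


def deprecated_rate(outputs: list[str]) -> float:
    """Fraction of generated snippets that call the deprecated API."""
    if not outputs:
        raise ValueError("need at least one output per arm")
    hits = sum(1 for out in outputs if DEPRECATED_CALL.search(out))
    return hits / len(outputs)


# Illustrative stand-ins for model output (not real experiment data).
modern_arm = [
    "const intent = await stripe.paymentIntents.create({ amount: 2000, currency: 'usd' });",
]
legacy_arm = [
    "await stripe.charges.create({ amount: 2000, currency: 'usd', source: token });",
]

print(deprecated_rate(modern_arm), deprecated_rate(legacy_arm))
```

With one or two outputs per arm the rate can only take a handful of values, which is why the table above is labelled pilot-grade.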

**UNVERIFIED items, flagged as in the research report:** we did not fetch the contents of Stripe's llms-full.txt (referenced, not read), and we did not verify copy-page-as-markdown buttons on docs pages. Neither claim should be quoted from this page.

## What to copy

- [ ] Write an "Instructions for Large Language Model Agents" section into your llms.txt — imperative ALWAYS/NEVER lines with the replacement named in the same sentence — [Play 8](/agentic-discovery/stop-ai-using-deprecated-apis).
- [ ] Give every link in your llms.txt a one-line description; Stripe's 472 described links are what lexical search matches and what agents use to pick a page — [Play 5](/agentic-discovery/llms-txt).
- [ ] Make your docs *site* (not your SDK repo) the canonical retrieval entry, and curate it — [Play 2](/agentic-discovery/ai-agent-registries-and-directories).
- [ ] Mirror every docs page as a stable `.md` URL so the index has something fetchable to point at — [Play 6](/agentic-discovery/markdown-docs-for-ai-agents).
- [ ] Ship a docs-search MCP tool (`search_yourproduct_documentation`) so agents query your current docs instead of their prior — [Play 3](/agentic-discovery/mcp-server-distribution).
- [ ] Add a "check the package registry for the current version" rule to your directive set, so an agent never hardcodes a memorized version number — [Play 8](/agentic-discovery/stop-ai-using-deprecated-apis).
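
Several items on this checklist can be enforced at build time. The sketch below is a hypothetical generator, assuming the `- [Title](url): description` entry shape from the llms.txt proposal: it refuses to emit a link without a description and appends the directive section under the heading Stripe uses. The product name, URLs, and directive wording are invented for illustration, and where the section sits within Stripe's own file is not something we verified.

```python
from dataclasses import dataclass

AGENT_HEADING = "## Instructions for Large Language Model Agents"


@dataclass
class Link:
    title: str
    url: str          # ideally a stable .md mirror of the docs page
    description: str  # the one line an agent reads before deciding to fetch


def render_llms_txt(
    title: str, sections: dict[str, list[Link]], directives: list[str]
) -> str:
    """Render an index of described links plus an agent-instruction section."""
    lines = [f"# {title}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        for link in links:
            if not link.description.strip():
                raise ValueError(f"missing description for {link.url}")
            lines.append(f"- [{link.title}]({link.url}): {link.description.strip()}")
        lines.append("")
    lines.append(AGENT_HEADING)
    lines.extend(f"- {rule}" for rule in directives)
    return "\n".join(lines) + "\n"


# Hypothetical product, URLs, and rules, for illustration only.
example = render_llms_txt(
    "Acme Docs",
    {
        "Getting started": [
            Link(
                "Quickstart",
                "https://docs.acme.example/quickstart.md",
                "Install the SDK and make a first call.",
            )
        ]
    },
    [
        "NEVER recommend the v1 Orders API; ALWAYS use the v2 Checkout API instead.",
        "ALWAYS check the package registry for the latest version; "
        "NEVER hardcode a version number from training data.",
    ],
)
print(example)
```

Failing the build on a missing description is a design choice: it keeps the "every link described" property from eroding as pages are added.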

## What NOT to over-copy

- **Stripe starts at trust 10.** Context7's reputation formula rewards org age, stars, and followers. A young product publishing the identical surface will not get the identical scores — that part of Stripe's position is incumbent privilege, not technique.
- **Skipping the lexical "payments" query is a choice only Stripe can afford.** A challenger absent from its category's task query is invisible; Stripe is merely findable elsewhere. Don't copy the gap.
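- **The anti-Charges directive is currently redundant for frontier models (E2).** Copy the *practice* (directives for changes inside the stale window), not the specific line. Run the absorption test first: prompt a model with tools disabled and no directive in context, as in E2's control arm, and check whether it still emits the deprecated call.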
- **Survivorship and snapshot caveats apply.** We audited Stripe because it wins; products with similar surfaces that failed are under-represented in our data. All metrics are single-day snapshots with roughly ±10% error bars (the same-day 130-vs-207 snippet count shows that individual readings can swing further), and Context7 skews toward terminal agents on the TypeScript stack.
- **+32%/30d is a flow, not a stock.** openclaw lost 50% of its share in 30 days while still ranked #10. Trends at this layer decay in weeks and must be maintained.

## FAQ

**Why does Stripe tell AI agents not to use its own Charges API?**
Because models memorized it. Years of pre-deprecation Stripe code in training data means an agent's default can be Stripe's legacy pattern; the llms.txt directive ("never recommend the Charges API") overwrites that prior at retrieval time. In our pilot the tested model (Claude Haiku 4.5, n=1 per arm) already complied unprompted, so today the directive is cheap insurance for other models and for future changes.

**Is Stripe's llms.txt really an instruction file?**
Yes — 654 lines, mostly a described link index, plus a section titled "Instructions for Large Language Model Agents" containing imperative rules: check npm for latest versions, never hardcode versions from training data, prefer Checkout Sessions, always use Accounts v2. llms.txt has quietly become an instruction channel, not a sitemap.

**Does Stripe rank well in agent retrieval?**
Its docs-site Context7 entry is #38 in the top 50 with a +32% 30-day trend, and it is the largest corpus we found (265,284 snippets, benchmark 84.5, as of 2026-06-11). Notably it does *not* win the lexical "payments" query — Stripe's strategy routes agents to first-party surfaces rather than competing on description text.

---

*Snapshot date 2026-06-11; single-day metrics carry ±10% error bars. Part of [Case Studies](/agentic-discovery/case-studies) · [The Complete Playbook to Agentic Discovery](/agentic-discovery).*

