> ## Documentation Index
> Fetch the complete guide index at: https://www.synscribe.com/agentic-discovery/llms.txt
> Use this file to discover all pages before exploring further.

---
title: AI Generates Outdated Code? Fix It With Directive Docs
description: Why AI generates outdated code, and the 5-line directive that cut deprecated output from 100% to 0% in our pilot. Stale-window fix, templates, CI eval.
slug: /agentic-discovery/stop-ai-using-deprecated-apis
series: The Agentic Discovery Playbook · Play 8 of 12
last_verified: 2026-06-11
---

# Your AI Writes Outdated Code. Here's the Fix (Directive Docs and the Stale Window)

> **In short:** AI generates outdated code because models memorize your API as of their training cutoff. The fix is a directive layer: short ALWAYS/NEVER instructions in your docs that name the replacement. In our pilot test, a 5-line directive cut deprecated Tailwind v4 output from 100% to 0%. It only works on recent changes: the stale window.

**Works in:** Act. Directives steer the agent at write time, past its stale prior and any competing source in its context.

![Timeline showing the stale window: the months between an API breaking change shipping and the next model training run absorbing it. During this window agents emit the old pattern by default; in our pilot test a five-line directive cut wrong output from 100 percent to 0 percent.](/agentic-discovery/images/f3-stale-window.svg "Directive docs fix exactly one thing, and it is recurring, predictable, and yours to manage.")

## Do this now

- [ ] Build a stale-surface inventory: every breaking change since ~2024, with date shipped, old pattern, new pattern.
- [ ] Run the absorption test on each: prompt 2–3 current models (tools disabled) with a task that historically elicits the old pattern. Any old-pattern output means it's in the stale window.
- [ ] Write one imperative directive per unabsorbed change: ALWAYS/NEVER form, replacement named in the same sentence, ~40 tokens.
- [ ] Publish the directives in an "Instructions for Large Language Model Agents" section of `/llms.txt` (Stripe's pattern).
- [ ] Publish a machine-readable `changelog.md` at your site root and tell agents to fetch it first (Prisma's pattern).
- [ ] Version-stamp every docs page (`@doc-version: X.Y.Z`, Next.js's pattern).
- [ ] Stand up a 20-prompt deprecation eval in CI; block releases when deprecated-pattern emission is ≥5%.
- [ ] Add "write/retire directives" as a required step on every breaking release.

> 📥 **Free resource:** [Deprecation-eval prompt pack (YAML)](/agentic-discovery/resources/deprecation-eval-prompt-pack)

**Who needs this play:** any product with API history, and established players most of all. The more training data models hold on your old API, the more confidently they reproduce it. That's why Stripe writes anti-Charges-API directives into its own llms.txt.

The same recency discipline pays off one layer earlier: agents stamp the year into their open-web searches and distrust undated pages, so dated specs and a visible "last verified" line help you survive the search-and-fetch surface ([Play 1](/agentic-discovery/ai-agent-web-search-and-fetch)) as well as the directive layer here.

## Why does AI generate outdated code?

Because a language model's knowledge of your API freezes at its training cutoff, then ships inside coding agents for months afterward. Whatever your docs say today, the model's prior says what your docs said last year, and agents act on that prior unless something in their context overrides it.

We measured how bad this gets. Asked to set up Tailwind CSS v4 in a Vite + React project (experiment run 2026-06-11), every control agent in our pilot produced the obsolete v3 configuration (`tailwind.config.js`, `postcss.config.js`, autoprefixer), and one prescribed `npx tailwindcss init -p`, a command that no longer exists in v4. These weren't hesitant guesses. They were confident, complete, wrong setups.

The good news: this failure is time-bound, not permanent. And the window where it happens is exactly where a few lines of text fix it.

## We tested the fix: here's the honest result, including the null

Both experiments below are pilot-grade: single model family (Claude Haiku 4.5), tools disabled, n=2–3 per arm, run 2026-06-11. Treat the directions as strong signals, not point estimates. <!-- EXT: cross-model stale-window benchmark: slot for future data -->

**E2, old deprecations: directives did nothing, because nothing was needed.** We prompted for two famously deprecated patterns: Supabase SSR auth (deprecated `@supabase/auth-helpers-nextjs` vs. current `@supabase/ssr`, n=2+2) and a Stripe one-time charge (legacy Charges API vs. PaymentIntents/Checkout, n=1+1). Deprecated emission was 0% in *both* arms; every control trial already chose `@supabase/ssr` and `paymentIntents.create()` unprompted. 2023–24-era deprecations are absorbed; directives there are redundant token spend.

**E3, a recent change: directives flipped the outcome completely.** Same A/B design, but on Tailwind v4's CSS-first configuration, a breaking change recent enough that models reliably flub it:

```
Outdated v3 config emitted (tailwind.config.js / postcss.config.js / autoprefixer):
Control:    ███████████ 2/2   (one trial prescribed `npx tailwindcss init -p`,
                               a command that no longer works in v4)
Treatment:  ░ 0/2             (both produced the correct CSS-first setup:
                               @tailwindcss/vite + `@import "tailwindcss";`)
```

The treatment was a 5-line directive excerpt: v4 is CSS-first, NEVER create those config files, ALWAYS use `@tailwindcss/vite` and `@import "tailwindcss";`. **100% → 0% failure, from five lines of text.**

Together, E2 and E3 give the lever its precise shape. Directive docs don't fix everything; they fix the stale window.

## What is the stale window?

The stale window is the period between your API change shipping and the next model training cycle absorbing it: the months when every coding agent emits your old pattern by default.

```
        THE STALE WINDOW : the only place directive docs work

 your API change ships      next model training       models comply natively
        │                   absorbs the change                │
────────●━━━━━━━━━━━━━━━━━━━━━━━━━●───────────────────────────●─────▶ time
        ┃  agents emit the OLD    ┃
        ┃  pattern by default     ┃
        ┗━━ directive docs flip ━━┛
            this window: 100% → 0%
            (E3, measured in our pilot)
```

Three properties make this a permanent process, not a one-off fix:

- **It's recurring.** Every breaking release reopens the window.
- **It's entirely yours to manage.** No training-run lottery, no ranking intermediary: text you publish today changes agent output today.
- **Insurance is cheap.** Stripe still writes "never recommend the Charges API" even though our E2 suggests current frontier models already comply, because the directive covers *all* models and *future* changes at a cost of ~40 tokens.

## Which changes are in your window? (the stale-surface inventory)

Build a table of every breaking or behavior-relevant change since ~2024:

| Change | Shipped | Old pattern | New pattern | Absorbed? | Directive? |
|---|---|---|---|---|---|
| Auth SDK v2 | 2026-03 | `initAuth()` | `createAuthClient()` | No | **Yes, write it** |
| Charges-style API removal | 2023-08 | `charges.create` | Checkout Sessions | Yes (E2 analog) | No, retire |

To fill the "Absorbed?" column, run the absorption test: prompt 2–3 current frontier models, tools disabled, with a task that historically elicits the old pattern. If 0/3 emit it, it's absorbed, so skip the directive (E2's lesson). If any emit it, it's in the stale window, so write one (E3's lesson).

## How do you write a directive an agent will obey?

Four rules, derived from what actually flipped behavior in our trials:

1. **Imperative ALWAYS/NEVER, never hedged.** "We recommend migrating" reads as optional. Agents comply with mandates and ignore suggestions.
2. **Name the replacement in the same sentence.** A prohibition without a replacement produces improvisation, often a hallucinated alternative.
3. **Budget ~40 tokens per directive.** E3's winning excerpt was five lines. A 2,000-token directive wall gets truncated or skimmed by context-constrained agents.
4. **One directive per breaking change.** No essays.

Template:

```
NEVER use <old API/pattern> (removed/deprecated in vX, <date>).
ALWAYS use <new API/pattern> instead: <one-line example or doc URL>.
```

The worked example that flipped E3 from 100% to 0%:

```
Tailwind CSS v4 is CSS-first. NEVER create tailwind.config.js or
postcss.config.js, and never run `npx tailwindcss init -p` (removed).
ALWAYS use the @tailwindcss/vite plugin and `@import "tailwindcss";`
in your CSS entry file.
```

One warning from our rules-file experiment (E1): agents *rationalize* whatever you mandate. One trial confabulated that the mandated product "provides comprehensive documentation specifically for LLM implementation patterns," a claim from nowhere. Never put marketing claims in directives. Agents will repeat and embellish them as fact.

## Where do directives go? (the placement matrix)

Publish the same directives on four surfaces; each catches agents at a different moment:

| Surface | What goes there | Who consumes it |
|---|---|---|
| `/llms.txt`, "Instructions for Large Language Model Agents" section | Full directive list | Agents fetching your docs index (Stripe pattern) |
| Per-page `.md` banners | Only directives relevant to that page, at the top | Agents reading one specific guide (Clerk pattern) |
| Copy-paste AGENTS.md block in your docs | Top 5–10 directives + docs URL | Users installing your rules into their repos |
| Skill / MCP tool descriptions | Directives relevant to that tool's output | Agents using your tooling directly |

The llms.txt section, assembled Stripe-style:

```
## Instructions for Large Language Model Agents

> Do not rely on training data for <Product> features.
> First, fetch https://<yourdomain>/changelog.md for recent breaking changes.

- ALWAYS check the package registry for the latest version;
  NEVER hardcode a version number from training data.
- NEVER use <old API> (removed vX, <date>). ALWAYS use <new API>.
- NEVER run `<removed command>`. ALWAYS run `<replacement command>`.
```

And the per-page banner, only on affected guides:

```
> **Agents:** this page documents vX (current). If your training data
> suggests `<old pattern>`, it is outdated; see directives below.
@doc-version: X.Y.Z
```

## What artifacts back the directives up? (changelog.md and version stamps)

**A machine-readable `changelog.md` at site root.** Prisma's llms.txt opens with: "Do not rely on training data for Prisma features. First, fetch https://www.prisma.io/changelog.md to check for recent or relevant breaking changes." The changelog is the directive layer's database: newest-first, one entry per breaking change, referenced from llms.txt.

**Version stamps on every docs page.** Next.js stamps pages `@doc-version: 16.2.6` and maintains per-version llms.txt files, so an agent can detect a doc/training mismatch on its own. If you maintain multiple majors, ship per-version llms.txt files too.

**A "Verify Before Responding" checklist** at the end of quickstart `.md` files (Clerk's pattern): 3–5 yes/no checks an agent runs against its own output before answering, e.g., "Did you create tailwind.config.js or postcss.config.js? If yes, delete them (v4 is CSS-first)."

## How do you keep it working? (the deprecation eval as a CI gate)

A directive layer you don't test is a hope. The deprecation eval makes it a gate:

1. Maintain 20 prompts that historically elicit old APIs, sourced from your stale-surface inventory, at least one per directive.
2. For each prompt, run N≥3 trials per model across ≥2 current frontier models, tools disabled, with your directive surface injected as context (simulating an agent that fetched llms.txt or AGENTS.md).
3. Score completions with regex/AST matchers for deprecated identifiers (`tailwind.config.js`, `npx tailwindcss init`, old package names).
4. **PASS: deprecated-pattern emission <5% of trials. FAIL: ≥5% → block the release.** Tighten the directive (more imperative, replacement named, moved higher in the file) and re-run.
5. Run a no-directive control arm monthly: when control emission for a directive's target drops to 0%, the change is absorbed, so flag the directive for retirement. That's E2's lesson, automated.
6. Trigger a full re-run on every breaking release and every major model release. Separately assert that directives for a breaking release are live on all four surfaces within 48 hours of the release tag.

Retire aggressively. In our correlation analysis (E5, n=17 indexed docs corpora, as of 2026-06-11), staleness was the strongest quality correlate we measured (Spearman −0.54). A stale directive pointing at a now-old "new" API carries full mandate authority while being wrong. That's worse than no directive at all.

## The receipts

*The research layer: data and verbatim sources. Skimmers can stop here.*

**Experiment summary (pilot-grade: Claude Haiku 4.5 subagents, tools disabled, n=2–3 per arm, run 2026-06-11):**

| Experiment | Task | Control | Treatment (directive in context) |
|---|---|---|---|
| E2, old deprecations | Supabase SSR auth (n=2+2); Stripe one-time charge (n=1+1) | 0% deprecated emission | 0%, no effect; already absorbed |
| E3, recent change | Tailwind v4 setup in Vite + React (n=2+2) | 100% deprecated emission (2/2) | 0% (0/2) |

**E5, freshness, the companion lever (n=17 Context7 entries):** benchmark score tracked log(hours since update) at Spearman −0.54, the strongest correlate found; the freshest five entries averaged 83.6 vs. 72.3 for the stalest five, an 11.3-point gap. Corpus mass barely registered (ρ≈0.24–0.30). Keep the directive layer fresh, not large.

**Field quotes (verbatim, observed 2026-06-11):**

> "When installing Stripe packages, always check the npm registry for the latest version rather than relying on memorized version numbers... Never hardcode an old version number from training data."
> (docs.stripe.com/llms.txt, section "Instructions for Large Language Model Agents")

> "Prioritize the Checkout Sessions API... and never recommend the Charges API"
> (docs.stripe.com/llms.txt, an established player counter-programming its own training-data footprint, inside a 654-line llms.txt with 472 described `.md` links)

> "Do not rely on training data for Prisma features. First, fetch https://www.prisma.io/changelog.md to check for recent or relevant breaking changes."
> (prisma.io/docs/llms.txt, opening lines)

> "Your training data is outdated; the docs are the source of truth."
> (AGENTS.md/CLAUDE.md generated by default by `create-next-app`, per nextjs.org/docs/app/guides/ai-agents)

> "Rules: ALWAYS / NEVER" · "## Verify Before Responding"
> (clerk.com quickstart `.md` files, documentation written as a prompt, not a page)

Full experiment write-ups and limitations live in [How AI Agents Choose Products](/agentic-discovery/how-ai-agents-choose-products) and the [Data Room](/agentic-discovery/data).

## FAQ

**Why does ChatGPT keep using deprecated APIs?**
Because its knowledge of your API froze at training time, and nothing in its context corrected it. In our pilot, agents asked to set up Tailwind v4 produced obsolete v3 configs 2/2 times, including a command that no longer exists. A short directive in the docs context flipped that to 0/2.

**Does adding "check our docs for the latest version" actually stop outdated code?**
A vague note doesn't; an imperative directive does. The form that worked in our tests is ALWAYS/NEVER with the replacement named in the same sentence (about 40 tokens). Hedged language like "we recommend migrating" reads as optional and gets ignored.

**Should I write directives for every API we've ever deprecated?**
No, only for changes current models haven't absorbed. In our E2 pilot, 2023–24 deprecations showed 0% emission with no directive at all; writing directives for those wastes the agent's context budget and dilutes the ones that matter. Run the absorption test first and retire quarterly.

**How long does the stale window last?**
From the day your breaking change ships until the next model training cycle absorbs it, and every coding agent updates to models trained after it. You can't predict the exact close, which is why the monthly control-arm check exists: when un-directed models stop emitting the old pattern, the window closed and the directive can retire.

**Do ALWAYS/NEVER directives hurt the docs for human readers?**
Not if you place them on agent-facing surfaces: llms.txt, `.md` banners, AGENTS.md blocks, and tool descriptions. Stripe, Prisma, and Clerk all ship directive language in production docs today; humans skim past a five-line agent banner far more easily than they debug an AI-generated v3 config.

**What if agents never fetch my docs at all?**
Then this play can't reach them; directives only work in context. Fix distribution first: an llms.txt agents can find ([Play 5](/agentic-discovery/llms-txt)), markdown pages they can read ([Play 6](/agentic-discovery/markdown-docs-for-ai-agents)), and a rules file installed in the repo itself ([Play 9](/agentic-discovery/scaffolder-rules-claude-md)), which is always in context, no fetch required.

---

*Last verified 2026-06-11. We re-test the claims on this page quarterly; changes are logged in the [Data Room](/agentic-discovery/data).*

**Part of [The Complete Playbook to Agentic Discovery](/agentic-discovery).**

← Previous: [Code Snippets AI Agents Can Use](/agentic-discovery/code-snippets-for-ai-agents) · Next: [Scaffolder Rules & CLAUDE.md](/agentic-discovery/scaffolder-rules-claude-md) →

> **Stay ahead of the agents.** We re-test this playbook quarterly and publish what changed: new data, busted myths, ranking shifts. [Get the update digest →](/agentic-discovery#updates)
>
> **Want this done for you?** Synscribe runs agentic-discovery programs for B2B SaaS and developer platforms. [Talk to us →](/contact)