---
title: "GEO Myths: 14 Claims Tested Against Live Agent Data"
description: "GEO myths, tested: 14 echo-chamber claims (llms.txt, schema, directories, ranking #1, self-reported benchmarks) graded busted, unproven, or validated with data."
slug: /agentic-discovery/geo-myths-what-doesnt-work
series: The Agentic Discovery Playbook · Part 6 of 6
last_verified: 2026-06-12
---

# GEO Myths: What Everyone Says To Do (and What Our Data Shows)

> **In short:** Four surprises from testing 14 popular GEO claims: a one-paragraph rules file flipped agent product choice in 3 of 3 pilot trials (you're not locked out until the next training run); ~99% of MCP distribution bypasses standalone directories; freshness (not volume, not llms.txt) was the strongest quality correlate (ρ=−0.54); and in one instrumented run, agents opened just ~6% of the domains their searches surfaced while killing ~48% of the claims they checked, so ranking #1 is not the same as being read.

## Do this now

- [ ] Stop treating llms.txt as a rankings lever; ship it as plumbing and measure retrieval, not hope.
- [ ] Cancel the "more content" plan; rewrite your top snippets to be self-contained instead.
- [ ] Relabel your citation-tracker column from "AI visibility" to "mentions."
- [ ] Submit to the ~4 directories with measured impact; skip the rest, and audit any ClawHub exposure today.
- [ ] Write deprecation directives for the last ~18 months of API changes only, not your 2019 migration.
- [ ] Put a freshness alert on your index entries (more than 7 days stale = re-parse).
- [ ] If you ship an MCP server, spend equal effort on its discovery surfaces.
- [ ] Never hide prompts in your docs. The disclosed version works better and won't end in an incident report.
- [ ] Stop assuming a #1 ranking is enough: make titles/snippets read as a precise, dated answer, and back up every headline claim off your own domain (agents open ~6% of results and kill ~48% of the claims they check).

GEO advice mostly cites other GEO advice. This page applies the only kind of check that breaks that loop: each popular claim is stated fairly, then pressure-tested against our audit of 18 developer products, live Context7 rankings, the directory-landscape study, five controlled pilot experiments, and three instrumented agent research runs that traced every query, fetch, and verdict (all data 2026-06). One honesty note that applies everywhere we cite experiments E1–E5 and the three runs: these are pilot-grade (single model family, n=2–3 per arm; three runs on one agent). They're directional, stated as such, and scheduled for multi-model replication. We attack claims here, never people. Echo-chamber examples are linked by article title and site, and fairly summarized.

## How we graded each claim

| Verdict | Meaning |
|---|---|
| ❌ **BUSTED** | The echo chamber says do it; our data says it doesn't work as claimed |
| ⚠️ **UNPROVEN, TRACKING** | Widely asserted, no defensible evidence either way; we publish our test design and report when data lands |
| ✅ **VALIDATED (with conditions)** | Works, but narrower than the echo chamber claims |
| ☠️ **HARMFUL** | Actively damages you |

## The 14 claims, pressure-tested

### Myth 1: Does adding llms.txt boost your AI visibility?

**The claim:** "Add llms.txt and your AI visibility goes up."

**The echo chamber:** advocacy titles like ["What Is LLMs.txt? Plus, Why You Need It On Your Site"](https://aioseo.com/what-is-llms-txt/) (AIOSEO) and generator guides like ["What Is LLMs.txt? Exploring Its Function and How to Generate It?"](https://seomator.com/blog/what-is-llms-txt-how-to-generate-it) (SEOmator) frame it as cheap insurance: optional today, essential tomorrow. The same industry is split, though: see ["LLMs.txt: Why Brands Rely On It and Why It Doesn't Work"](https://seranking.com/blog/llms-txt/) (SE Ranking) and ["LLMs.txt For AI SEO: Is It A Boost Or A Waste Of Time?"](https://www.searchenginejournal.com/llms-txt-for-ai-seo/556576/) (Search Engine Journal).

**Our data, both ways (as of 2026-06-11):** Tailwind has *no* llms.txt and holds two top-20 retrieval slots (~3.2% combined share): an established player's privilege. Hono ships an exemplary three-tier llms.txt and is invisible (absent from the top 50; its repo index entry parsed to just 48 snippets, benchmark 63.3). An uncurated index entry nullified a perfect file. No major LLM provider officially confirms consuming it for ranking. But it *is* verifiably consumed by retrieval workflows: Context7's llmstxt-sourced entries score 80–84, Cursor ingests it for @docs, and the Mintlify ecosystem ships it by default.

**Verdict: ⚠️ UNPROVEN, TRACKING as a ranking lever · ✅ VALIDATED as infrastructure.** Necessary plumbing, unproven magic. <!-- EXT: honeypot causal study: controlled sites ± llms.txt, differential crawl/citation, quarterly reporting -->

**Do instead:** ship it correctly and curate the index entry it feeds. [Play 5: llms.txt done right](/agentic-discovery/llms-txt).

### Myth 2: Does more content mean more AI visibility?

**The claim:** "Publish more: bigger docs, more pages, more volume."

**The echo chamber:** velocity-first programs rebranded for AI search. ["Content Velocity vs Content Quality for AI Search"](https://surferstack.com/guides/content-velocity-vs-content-quality-for-ai-search-what-actually-drives-citations-in-2026) (Surferstack) claims "brands publishing 3–5 quality articles per week consistently outrank those publishing one 'perfect' piece per month," and [LLM-seeding guides](https://neilpatel.com/blog/llm-seeding/) (Neil Patel) frame breadth of published mentions as the lever.

**Our data:** corpus mass barely correlates with retrieval-quality scores. Across our n=17 audited Context7 entries, log(tokens) and log(snippets) sit at ρ≈0.24–0.30 against benchmark. The pairwise illustration: Drizzle's 440 snippets score 82.8; Polar's 2,297 snippets score 64.7. The small corpus wins by 18 points. The rubric measures density and self-containment (install + imports + code + output, no duplication), not mass.

**Verdict: ❌ BUSTED.**

**Do instead:** engineer fewer, denser, self-contained snippets. [Play 7: code snippets for AI agents](/agentic-discovery/code-snippets-for-ai-agents).

### Myth 3: Are AI citations the same thing as AI visibility?

**The claim:** "Track your AI citations. That's your AI visibility."

**The echo chamber:** an entire tool category now sells this framing: [Otterly.AI's "AI Search Monitoring Tool"](https://otterly.ai/), Ahrefs' ["How to Monitor Brand Mentions in ChatGPT"](https://ahrefs.com/blog/monitor-brand-mentions-chatgpt/), and roundups like ["7 Best Tools to Monitor ChatGPT Brand Mentions in 2026"](https://www.workduo.ai/blog/best-tools-to-monitor-chatgpt-brand-mentions) (WorkDuo) and ["Best AI Visibility Tools for Brand Mention Tracking (2026)"](https://www.therankmasters.com/insights/ai-visibility/ai-brand-mention-tracking-tools) (The Rank Masters).

**Our data:** the tools measure a real channel: mentions in generated *answers*. But agents *choosing* products in code is a separate layer. ["A Study of LLMs' Preferences for Libraries and Programming Languages"](https://arxiv.org/abs/2503.17181) (arXiv) reports that LLMs show very low consistency between the libraries they *recommend* when asked and the libraries they *actually use* when generating code for the same task. Our pilot E4 points the same way: a single operability fact (MCP server + llms.txt + skills) flipped selection 2/2, and no answer-panel tracker observes that fact.

**Verdict: ✅ VALIDATED (with conditions) for the answer channel. ⚠️ The selection half of "visibility" is unmeasured by these tools.** <!-- EXT: mention-vs-selection gap study (Study 6): our own headline data -->

**Do instead:** measure all four layers. [Part 5: measuring agentic visibility](/agentic-discovery/measure-ai-visibility).

### Myth 4: Should you get listed in every AI directory?

**The claim:** "Get listed everywhere: every MCP directory, every AI tools list."

**The echo chamber:** submission listicles like ["MCP Registries in 2026: Where to List Your Server for AI Tool Discovery"](https://roxyapi.com/blogs/mcp-registries-where-to-list-your-server-2026) (RoxyAPI) and ["MCP Server Directories: The Complete List to Get Your Server Found"](https://dynomapper.com/blog/ai/mcp-server-directories/) (DYNO Mapper); even distribution playbooks that get other things right prescribe "be in every directory," naming mcp.so and Glama as registries that matter (e.g. the widely-shared [2026 MCP distribution playbook recap](https://dev.to/toolstem/i-built-the-mcp-server-greg-isenberg-recommends-in-his-2026-distribution-playbook-heres-day-7-3c33)).

**Our data (as of 2026-06-11):** only ~4 placements show measured impact (the official MCP Registry→GitHub→VS Code chain, skills.sh, Anthropic's surfaces, Context7). The clarifying number: Context7's MCP has 1.14M weekly npm installs but just 6.8K visible "uses" on Smithery. Roughly 99% of real distribution bypasses standalone directories. mcp.so, Glama, and mcpmarket list 19.7K–34K servers with zero published usage evidence and no client integration. And the harmful corner: ClawHub's documented malware epidemic (341 confirmed malicious skills distributing an infostealer, with third-party estimates putting 8–20% of that registry as malicious) makes indiscriminate listing a brand-adjacency risk, not just wasted hours.

**Verdict: ❌ BUSTED, with a ☠️ HARMFUL corner (indiscriminate listing next to documented malware).**

**Do instead:** the four Tier-1 runbooks. [Play 2: AI agent registries and directories](/agentic-discovery/ai-agent-registries-and-directories).
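
The ~99% bypass figure in the data above is plain arithmetic, and a minimal sketch makes it easy to re-check as the counters change. It uses the two numbers quoted in this guide's receipts and assumes Smithery "uses" and weekly npm installs are comparable counters, which is a rough approximation rather than a like-for-like measurement.

```python
# Figures quoted in this guide's receipts (2026-06-11 snapshot).
WEEKLY_NPM_INSTALLS = 1_136_447  # Context7 MCP, weekly npm installs
SMITHERY_USES = 6_800            # visible "uses" on Smithery (reported as 6.8K)


def bypass_share(directory_uses: int, total_installs: int) -> float:
    """Fraction of installs that did not flow through the directory.

    Assumption: both counters measure roughly the same kind of event.
    """
    if total_installs <= 0:
        raise ValueError("total_installs must be positive")
    return 1 - directory_uses / total_installs


share = bypass_share(SMITHERY_USES, WEEKLY_NPM_INSTALLS)
print(f"{share:.1%}")  # 99.4%
```

Swapping in your own server's directory and package-registry counts gives the same ratio for your distribution.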

### Myth 5: Are you locked out until the next training run?

![Two-by-two map of products by training-data prior and agent retrieval demand. Next.js sits high on both. Better Auth has high retrieval demand despite a weak training prior. Tailwind has a strong prior with no first-party agent surface. Hono and Crossmint have agent surfaces but low retrieval demand. The playbook moves products from the weak-prior half toward high retrieval demand.](/agentic-discovery/images/f6-anomaly-map.svg "The 2×2 that explains the anomalies, and where the playbook moves you.")

**The claim:** "Defaults are baked into the model. New products must wait for the next training cycle."

**The echo chamber:** training-data fatalism runs through brand-recommendation explainers like ["Why Most E-Commerce Brands Are Invisible to ChatGPT: Understanding AI Training Data Gaps"](https://joinhexagon.com/blogs/why-most-e-commerce-brands-are-invisible-to-chatgp-mpnnuk3t-iiaj) (Hexagon), and guides like ["How ChatGPT Decides Which Brands to Recommend"](https://foglift.io/blog/chatgpt-brand-recommendations) (Foglift) that note models "may take months to reflect new brand information through model updates or retraining cycles." The framing is true about weights and wrong about the system.

**Our data:** Better Auth (repo created May 2024, after most current models' effective training saturation for library conventions) is the #2 most-fetched docs source among AI coding agents (4.59% share), behind only Next.js. Experimentally: in E1, a one-paragraph AGENTS.md flipped agent product choice 3/3 (control: the established player 3/3); in E3, a 5-line directive overrode memorized patterns 100%→0%. Context beats weights, today.

**Verdict: ❌ BUSTED.**

**Do instead:** own the context layers that override the prior. [Play 9: scaffolder rules and CLAUDE.md](/agentic-discovery/scaffolder-rules-claude-md), fed by [Play 5's retrieval surface](/agentic-discovery/llms-txt).

### Myth 6: Will AI always write outdated code for your API?

**The claim:** two opposite ones. "AI will always emit outdated code, nothing you can do" and "just add a 'check our docs' note and it's solved."

**The echo chamber:** the fatalist half lives in the same training-data framing as Myth 5 (months-to-retrain narratives); the "solved" half is implicit in docs-context tool marketing, which pitches up-to-date retrieval as wholesale insurance against outdated output. Neither names the boundary condition our experiments found.

**Our data:** both are wrong, and the truth is narrow. In E3, directive docs fixed a *recent* breaking change completely: agents setting up Tailwind v4 emitted obsolete v3 config 2/2 without docs and 0/2 with a short directive excerpt (100%→0%). In E2, the same treatment did nothing for *old*, loudly documented deprecations (Supabase auth-helpers, Stripe Charges API): emission was already 0% in both arms, because models internalized those fixes. Directives fix exactly the stale window: the months between your API change and the next training cycle absorbing it.

**Verdict: ✅ VALIDATED, narrowly.** Write directives for roughly the last 18 months of changes, not your 2019 migration.

**Do instead:** [Play 8: stop AI using deprecated APIs](/agentic-discovery/stop-ai-using-deprecated-apis).
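
One way to keep directives scoped to the stale window is to generate them from a changelog and drop anything older than about 18 months. The sketch below is illustrative only: the `ApiChange` record, the function names, and the changelog entries are all hypothetical, and the NEVER/ALWAYS wording is just one possible phrasing.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ApiChange:
    changed_on: date
    never: str   # the obsolete pattern agents should stop emitting
    always: str  # the current replacement


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later` (day of month ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def directives_for(changes: list[ApiChange], today: date,
                   window_months: int = 18) -> list[str]:
    """Render NEVER/ALWAYS lines only for changes inside the window."""
    recent = [c for c in changes
              if 0 <= months_between(c.changed_on, today) <= window_months]
    recent.sort(key=lambda c: c.changed_on, reverse=True)  # newest first
    return [f"NEVER {c.never}. ALWAYS {c.always}." for c in recent]


# Hypothetical changelog entries, for illustration only.
changes = [
    ApiChange(date(2019, 3, 1), "call client.v1_connect()", "call client.connect()"),
    ApiChange(date(2025, 11, 4), "import from 'acme/legacy'", "import from 'acme'"),
]
lines = directives_for(changes, today=date(2026, 6, 12))
print("\n".join(lines))
```

With these sample entries, only the 2025 change produces a directive; the 2019 migration is left out, matching the E2 finding that old, loudly documented deprecations no longer need one.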

### Myth 7: Is schema markup the key to AI search?

**The claim:** "Schema markup is the key to AI search optimization."

**The echo chamber:** guides like ["Schema Markup for GEO SEO | AI-Friendly Structured Data"](https://www.getpassionfruit.com/blog/ai-friendly-schema-markup-structured-data-strategies-for-better-geo-visibility) (Passionfruit), ["Schema Markup & Structured Data Best Practices for GEO in AI Search (2025)"](https://geneo.app/blog/schema-markup-structured-data-best-practices-geo-ai-search-2025/) (Geneo), and ["The Ultimate Schema Markup Guide For GEO, AEO And AI Overviews"](https://201creative.com/structured-data-ai-search/) (201 Creative) present structured data as foundational to AI visibility. Even within that industry, voices urge caution: ["How schema markup fits into AI search, without the hype"](https://searchengineland.com/schema-markup-ai-search-no-hype-472339) (Search Engine Land) and ["Schema vs. No Schema: Does Structured Data Matter for AI Search?"](https://www.evertune.ai/resources/insights-on-ai/schema-vs-no-schema-does-structured-data-matter-for-ai-search) (Evertune).

**Our data:** none of the 18 audited agent-retrieval winners rely on schema markup for agent retrieval. The observed consumption pattern is different: agents fetch llms.txt and `.md` mirrors and grep text. Schema plausibly matters for AI Overviews and answer engines, a different channel from coding agents, with different evidence standards.

**Verdict: ⚠️ UNPROVEN, TRACKING for agent selection.** Scoped claim, honest split. <!-- EXT: AI-Overviews-channel test design -->

**Do instead:** serve the formats agents demonstrably consume. [Play 6: markdown docs for AI agents](/agentic-discovery/markdown-docs-for-ai-agents).

### Myth 8: Does adding statistics, quotes, and citations boost GEO?

**The claim:** "Add statistics, quotes, and citations to your content for ~30–40% more AI visibility."

**The echo chamber:** this is the most-recycled finding in GEO. It originates in the academic paper ["GEO: Generative Engine Optimization"](https://arxiv.org/abs/2311.09735) (arXiv, KDD 2024) and gets repeated as universal advice in guides like ["Generative Engine Optimization: A Practical Guide"](https://www.semrush.com/blog/generative-engine-optimization/) (Semrush), ["Generative engine optimization: What we know so far"](https://blog.hubspot.com/marketing/generative-engine-optimization) (HubSpot), and ["Generative Engine Optimization (GEO): What It Is and Why It Matters"](https://www.thehoth.com/blog/generative-engine-optimization/) (The HOTH).

**Our data:** the paper's findings concern *citation in generated answers*, and they are endlessly repeated as if they covered agent product selection. Nobody has replicated them at the selection layer. We found no mechanism in our audits by which prose statistics would move a coding agent's choice. The defensible cousin we did verify: description/task-phrase matching in registries. DodoPayments owns the lexical "payments" query on Context7 (benchmark 83) while Stripe is absent from that query's top 10, because the winning description literally states the task.

**Verdict: ⚠️ UNPROVEN, TRACKING for agents.** <!-- EXT: replication design at the selection layer -->

**Do instead:** engineer your registry descriptions around real task phrases. [Play 2](/agentic-discovery/ai-agent-registries-and-directories).

### Myth 9: Can hidden prompts in your docs make AI recommend you?

**The claim:** "Stuff hidden instructions in your pages so agents recommend you."

**The echo chamber:** this one graduated from gray-hat threads to documented attack class. [Microsoft's security team formally named it "AI Recommendation Poisoning"](https://www.microsoft.com/en-us/security/blog/2026/02/10/ai-recommendation-poisoning/) (Feb 2026), documenting commercial tooling marketed as an "SEO growth hack for LLMs." Search Engine Land's verdict is in its headline: ["Hidden prompt injection: The black hat trick AI outgrew"](https://searchengineland.com/hidden-prompt-injection-black-hat-trick-ai-outgrew-462331); the academic treatment is ["Adversarial Search Engine Optimization for Large Language Models"](https://arxiv.org/html/2406.18382v1) (arXiv).

**Our data:** three independent reasons this damages you. Injection screening already exists at the index layer: Context7 scans snippets with an injection-detection model *before storage*, so the payload gets blocked and your source flagged. It's trivially detectable by anyone reading your public pages, and reputationally fatal when found. And the legitimate version of the same mechanism (the environment layer, disclosed and opt-out-able, see [Play 9](/agentic-discovery/scaffolder-rules-claude-md)) achieved a 100% selection flip in our pilots without hiding anything.

**Verdict: ☠️ HARMFUL.**

**Do instead:** the disclosed environment play. [Play 9: scaffolder rules, with the ethics box](/agentic-discovery/scaffolder-rules-claude-md).

### Myth 10: Does shipping an MCP server guarantee growth?

**The claim:** "Ship an MCP server and growth follows."

**The echo chamber:** "MCP servers as your sales team" is literally the #1 item in a widely-shared 2026 distribution playbook (see the [builder's recap on DEV](https://dev.to/toolstem/i-built-the-mcp-server-greg-isenberg-recommends-in-his-2026-distribution-playbook-heres-day-7-3c33)), alongside monetization takes like ["MCP Servers Are the New SaaS"](https://dev.to/krisying/mcp-servers-are-the-new-saas-how-im-monetizing-ai-tool-integrations-in-2026-2e9e) (DEV). To its credit, even that playbook concedes "distribution is the work, building was the easy part."

**Our data:** agent-operability genuinely flips selection, *when the agent knows about it*. In pilot E4, surfacing the fact that one vendor shipped MCP + llms.txt + skills flipped choice 2/2. The counterexample is Crossmint: a complete MCP + agent-docs surface with no findable entry in the dominant retrieval index. It's a full build that agents never discover. MCP without discovery distribution is a feature, not a channel.

**Verdict: ✅ VALIDATED (with conditions).** The condition is discovery: registries, `.well-known` manifests, llms.txt mentions.

**Do instead:** ship it *and* distribute it. [Play 3: MCP server distribution](/agentic-discovery/mcp-server-distribution).

### Myth 11: Can you publish llms.txt once and forget it?

**The claim:** "Set and forget. Publish the file, done."

**The echo chamber:** implicit in the one-shot "generate your llms.txt" tooling and checklists already cited under Myth 1 (the AIOSEO and SEOmator generator guides). The file is framed as a setup task, not an operated surface. Ironically, the freshness-adjacent advice the same industry gives for *content* (["Content Freshness: Why Regular Updates Improve Visibility"](https://www.quattr.com/blog/content-freshness), Quattr) rarely gets applied to the agent surface itself.

**Our data:** freshness was the strongest quality-score correlate we measured: Spearman ρ=−0.54 between hours-since-update and benchmark score (n=17), with an 11.3-point gap between the freshest five entries (avg 83.6) and the stalest five (72.3). And demand decays just as fast: openclaw lost 50% of its retrieval share in 30 days while still ranked #10. This layer is a flow, not a stock. It decays in weeks.

**Verdict: ❌ BUSTED.**

**Do instead:** re-parse on every docs deploy and run the weekly tracker. [Play 2](/agentic-discovery/ai-agent-registries-and-directories) + [Part 5](/agentic-discovery/measure-ai-visibility).
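
The 7-day freshness alert from the checklist can be sketched as a small check over index entries. The entry names and timestamps below are hypothetical, and how you obtain each entry's last-updated time (index page, API, manual log) is left as an assumption.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=7)  # threshold from the "Do this now" checklist


def is_stale(last_updated: datetime, now: datetime,
             limit: timedelta = STALE_AFTER) -> bool:
    """True when an index entry is older than the allowed limit."""
    return now - last_updated > limit


# Hypothetical index entries with their last-updated timestamps (UTC).
entries = {
    "/acme/sdk": datetime(2026, 6, 1, tzinfo=timezone.utc),
    "/acme/docs": datetime(2026, 6, 10, tzinfo=timezone.utc),
}
now = datetime(2026, 6, 12, tzinfo=timezone.utc)

stale = sorted(name for name, ts in entries.items() if is_stale(ts, now))
print(stale)  # ['/acme/sdk']
```

Run on a schedule, anything in `stale` is a candidate for a re-parse request.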

### Myth 12: Is ranking #1 in AI search enough to get picked?

**The claim:** "Rank at the top of the results an agent searches and it will choose you."

**The echo chamber:** the reflex imported wholesale from SEO ("win the SERP and you win the customer"), restated for agents as "get to position one in AI search." It treats the ranked list as the decision, exactly as a human's ten blue links once were. <!-- EXT: collect 2–3 representative "rank #1 for AI search" echo-chamber links during production -->

**Our data:** ranking is not opening. In our instrumented high-stakes run, agents surfaced **215 distinct domains across their searches but actually fetched only ~13, a ~6% open rate.** The agent reads your snippet, not your page, and the decision to open is driven by title, meta description, domain authority, and apparent freshness. A #1 result that doesn't read like a precise, dated answer to the agent's (long, spec-loaded) query simply never gets opened.

**Verdict: ❌ BUSTED.** Ranking gets you surfaced. Being worth opening gets you read.

**Do instead:** engineer fetch-worthy titles and snippets for the agent's own query. [Play 1: get found in agent web search](/agentic-discovery/ai-agent-web-search-and-fetch).

### Myth 13: Do your published specs and benchmark numbers speak for themselves?

**The claim:** "Put your accuracy figures and feature claims on your site and agents will repeat them."

**The echo chamber:** the "publish your stats and let the AI quote them" advice (a cousin of the GEO statistics folklore in Myth 8) assumes an agent ingests a vendor page's self-reported numbers at face value. <!-- EXT: collect 2–3 representative "publish your benchmarks for AI" echo-chamber links during production -->

**Our data:** agents grade source type and try to refute claims before believing them. In the instrumented email run, the agent extracted 87 claims, escalated 25 to a **three-voter adversarial panel ("Be SKEPTICAL… ≥2/3 refutations kill it"), and killed 12, a 48% kill rate.** One widely-repeated vendor claim was refuted 3–0. Across runs it explicitly down-weighted vendor-authored pages (noting one "best API of 2026" listicle "literally ranks [its own publisher] #1, treat specific accuracy numbers as marketing"). What survives is what a **primary or third-party source** also states. A self-scored number isn't an asset here. It's a liability that can discount your whole page.

**Verdict: ❌ BUSTED.** Self-published specifics don't survive. Backed-up ones do. <!-- EXT: multi-agent verification-rate study: slot for future data -->

**Do instead:** back up every headline claim off your own domain ([Play 1](/agentic-discovery/ai-agent-web-search-and-fetch)), and feed the trusted third-party surface with [Play 11: public evals](/agentic-discovery/ai-evals-and-leaderboards).
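
The "≥2/3 refutations kill it" rule is a simple majority vote, and the 48% kill rate follows from the reported counts. In the sketch below, only the totals (25 claims escalated, 12 killed) come from the run; the individual vote splits are hypothetical and chosen to reproduce those totals.

```python
def is_killed(refutations: list[bool], threshold: int = 2) -> bool:
    """A claim is killed when at least `threshold` voters refute it."""
    return sum(refutations) >= threshold


def kill_rate(panel_votes: list[list[bool]]) -> float:
    """Share of escalated claims that the panel killed."""
    if not panel_votes:
        raise ValueError("no claims were escalated")
    return sum(is_killed(v) for v in panel_votes) / len(panel_votes)


# Hypothetical vote splits that add up to the reported 25 escalated / 12 killed.
votes = (
    [[True, True, True]] * 5        # refuted 3-0  -> killed
    + [[True, True, False]] * 7     # refuted 2-1  -> killed
    + [[True, False, False]] * 6    # one refutation -> survives
    + [[False, False, False]] * 7   # no refutations -> survives
)
print(f"{kill_rate(votes):.0%}")  # 48%
```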

### Myth 14: Does winning AI search in one category carry to the next?

**The claim:** "Build agent-search authority once and it compounds across everything you sell."

**The echo chamber:** "domain authority" thinking transplanted from SEO: the belief that a strong, trusted domain wins the agent's consideration set wherever you expand. <!-- EXT: collect 2–3 representative "domain authority for AI" echo-chamber links during production -->

**Our data:** there is no observed transfer. Across all three instrumented runs the agents touched **596 distinct domains, and not one *product* domain appeared in more than one category.** The only repeats were a PR wire and a dev blog. Agents assemble the consideration set per category, query by query, from whoever best answers that specific question. Whatever you've earned in one category doesn't pre-load you into the next.

**Verdict: ❌ BUSTED (n=3 runs, directional).** Earn each category on its own.

**Do instead:** treat each category as a fresh surface to win, and measure per-category. [Part 5](/agentic-discovery/measure-ai-visibility) + [Play 1](/agentic-discovery/ai-agent-web-search-and-fetch).
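
Measuring per-category transfer comes down to checking which domains recur across categories and whether any of them are product domains. This is a minimal sketch with hypothetical category names and `.example` domains, not the traced data; it mirrors the observed pattern in which only non-product sites repeated.

```python
from collections import Counter


def domains_in_multiple_categories(runs: dict[str, set[str]]) -> set[str]:
    """Domains that appear in more than one category's run."""
    counts = Counter(d for domains in runs.values() for d in domains)
    return {d for d, n in counts.items() if n > 1}


# Hypothetical per-category domain sets, for illustration only.
runs = {
    "email": {"mailvendor.example", "prwire.example", "devblog.example"},
    "auth": {"authvendor.example", "prwire.example"},
    "payments": {"payvendor.example", "devblog.example"},
}
product_domains = {"mailvendor.example", "authvendor.example", "payvendor.example"}

repeats = domains_in_multiple_categories(runs)
product_repeats = repeats & product_domains

print(sorted(repeats))          # ['devblog.example', 'prwire.example']
print(sorted(product_repeats))  # []
```

An empty `product_repeats` set is what "no observed transfer" looks like in this framing.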

## Where do these verdicts get updated?

Every claim above lives as a tracked row in the **Pressure-Test Ledger** (claim · verdict · evidence link · last tested · next test date) in [the Data Room](/agentic-discovery/data). The ⚠️ UNPROVEN items are not rhetorical: each has a designed test (the llms.txt honeypot study, the mention-vs-selection gap study, the selection-layer GEO replication), and verdicts graduate to ❌ or ✅ as data lands. We update the ledger in public, quarterly.

## The receipts

*Research layer: the evidence behind the verdicts, with scale stated.*

**Experiments (run 2026-06-11, Claude Haiku 4.5 subagents, tools disabled, n=2–3 per arm, pilot-grade):**

| Experiment | Design | Result |
|---|---|---|
| E1: rules-file flip | "Add auth" task ± AGENTS.md mandating a non-default product | Control: established player 3/3 · Treatment: mandated product 3/3 |
| E2: old deprecations | Directive docs vs 2023–24-era deprecated APIs | 0% deprecated emission in *both* arms (directives redundant) |
| E3: recent change | Directive docs vs Tailwind v4 setup | Obsolete config: control 2/2 → treatment 0/2 |
| E4: operability fact | Email-API choice ± "vendor X ships MCP/llms.txt/skills" (synthetic fact) | Control: Postmark 2/2 · Treatment: flipped 2/2 |
| E5: correlations | n=17 Context7 entries | Freshness ρ=−0.54 (strongest); tokens/snippets ρ≈0.24–0.30 |
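
For readers who want to reproduce an E5-style correlation on their own entries, Spearman's ρ is the Pearson correlation of the ranks. The sketch below uses only the standard library; the five data points are hypothetical (they are not the n=17 audit data), and the function names are illustrative.

```python
from math import sqrt


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: list[float], y: list[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical entries: hours since last update vs benchmark score.
hours_since_update = [2, 10, 30, 100, 400]
benchmark = [84, 86, 80, 75, 70]

rho = spearman_rho(hours_since_update, benchmark)
print(round(rho, 2))  # -0.9
```

A negative ρ means staler entries tend to score lower; as with E5, it describes association in a small sample, not causation.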

**Three instrumented agent research runs (Claude Code, opus-4.x, 2026, Birdseye traces, pilot-grade, n=3 runs on one agent):** same class of "what should I build with?" question drew 6 / 116 / 344 web operations (a 57× spread by technique: inline · 5 sub-agents · 101-sub-agent workflow). Highest-volume run: 215 domains surfaced, ~13 fetched (~6% open rate); 87 claims extracted, 25 adversarially verified, 12 killed (48%). Across all three runs: 596 distinct domains, zero product-domain overlap between categories. These trace the open-web search/fetch surface (Myths 12–14). The Context7 numbers below trace the structured retrieval surface.
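
The spread and open-rate figures above can be re-derived from the raw counts. In this sketch the pairing of operation counts to techniques assumes the two lists in the paragraph are given in the same order.

```python
def open_rate(fetched: int, surfaced: int) -> float:
    """Share of surfaced domains that the agent actually fetched."""
    if surfaced <= 0:
        raise ValueError("surfaced must be positive")
    return fetched / surfaced


# Web operations per run; technique labels paired by listing order (assumption).
web_ops = {"inline": 6, "5 sub-agents": 116, "101-sub-agent workflow": 344}

spread = max(web_ops.values()) / min(web_ops.values())
rate = open_rate(fetched=13, surfaced=215)

print(f"spread: {spread:.0f}x")  # spread: 57x
print(f"open rate: {rate:.1%}")  # open rate: 6.0%
```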

**Observational receipts (all 2026-06-11):** Better Auth at #2 with 4.59% share (repo created 2024-05-19, GitHub-API-verified) · Tailwind top-10 with zero first-party agent surface (no llms.txt, no `.md` mirrors, no MCP, confirmed against working controls) · Hono absent from top 50 despite pioneering tiered llms.txt (repo entry: 3,267 tokens, 48 snippets, benchmark 63.3) · Drizzle 440 snippets/82.8 vs Polar 2,297/64.7 · DodoPayments wins lexical "payments" (benchmark 83); Stripe absent from that query's top 10 while /websites/stripe holds 265,284 snippets, benchmark 84.5 under its own name · Crossmint: full agent surface, no findable Context7 entry (three slug guesses + search; UNVERIFIED whether one exists under another name) · Smithery "uses" for Context7: 6.8K vs 1,136,447 weekly npm installs (99.4% bypass) · mcp.so/Glama/mcpmarket: 19.7K–34K listed servers, no usage data, no client integration · ClawHub: 341 confirmed malicious skills (The Hacker News, Feb 2026); Snyk counted 1,467; estimates 8–20% of registry malicious · openclaw −50% retrieval share in 30 days · freshest-5 benchmark avg 83.6 vs stalest-5 72.3.

**Limitations, stated plainly:** pilot-scale n; single model family; Context7 is one vendor-owned index skewed toward terminal agents (~73%) on the TypeScript/React stack; retrieval ≠ selection (so we never grade a myth on retrieval data alone where the claim concerns selection); E5 is correlational on a non-random n=17. Full bias register in the [Data Room](/agentic-discovery/data).

## FAQ

**Does llms.txt help SEO?**
There is no defensible evidence that llms.txt improves rankings in AI search or traditional SEO, and no major LLM provider officially confirms consuming it for ranking. It is validated as infrastructure: real retrieval workflows (Context7 sourcing, Cursor @docs, the Mintlify ecosystem) demonstrably consume it. Ship it as plumbing, not as magic.

**Does publishing more content improve AI visibility?**
No. Corpus mass barely correlates with agent retrieval-quality scores (ρ≈0.24–0.30 in our n=17 sample). A 440-snippet corpus outscored a 2,297-snippet corpus by 18 points; density and self-containment win.

**Are AI citation trackers worth it?**
They are worth it for the answer channel (mentions in ChatGPT/Perplexity-style responses) and misleading as a measure of total AI visibility. Agents choosing products in code respond to operability facts these tools don't observe; in our pilot, one such fact flipped selection 2/2.

**Do AI directories matter?**
About four placements show measured impact: the official MCP Registry (which feeds GitHub and VS Code), skills.sh, Anthropic's surfaces, and Context7. Roughly 99% of measured MCP distribution bypasses standalone directories, and one large skill registry carried 341 confirmed malicious listings.

**Can a new product become an AI default before the next training run?**
Yes. A two-year-old library is the #2 most-fetched docs source among AI coding agents, and in our pilots a one-paragraph context file flipped agent product choice in 3 of 3 trials. Retrieval and environment context override training priors today.

**How do I stop AI from generating outdated code for my API?**
Write imperative directives (NEVER X / ALWAYS Y) covering your recent changes, roughly the last 18 months, in llms.txt and your docs. In our pilot this took deprecated-pattern emission from 100% to 0% on a recent breaking change; for years-old deprecations, models have usually already absorbed the fix.

**Is schema markup good for AI search?**
For coding agents, it's unproven: none of the 18 agent-retrieval winners we audited rely on schema for agent retrieval. Agents fetch llms.txt and markdown and grep text. Schema plausibly matters for AI Overviews and answer engines, which are a different channel.

**Do statistics and quotes improve GEO?**
The famous findings behind that advice measured citation in generated answers, not agent product selection, and no one has replicated them at the selection layer. The verified analog for agents is description/task-phrase matching in registries.

**Can hidden prompts in my docs make AI recommend my product?**
No, and they can hurt you. At least one major index (Context7) screens snippets for injection before storage, the trick is publicly detectable, and the disclosed alternative (rules files with opt-out) achieved a 100% selection flip in our pilots without the risk.

**Does shipping an MCP server increase adoption?**
Only when agents can discover it. Operability flipped selection 2/2 in our pilot when surfaced in context, while a product with a complete MCP surface but no registry presence was effectively invisible. Pair the server with registry listings and `.well-known` manifests.

**How often should llms.txt and docs indexes be updated?**
On every docs deploy, with a re-parse trigger. Alert if your index entry shows more than 7 days since update. Freshness was the strongest quality-score correlate we measured (ρ=−0.54), worth 11.3 benchmark points between freshest and stalest.

**Is ranking #1 in AI search enough for an agent to pick me?**
No. In our instrumented runs, agents opened only ~6% of the domains their searches surfaced. They read the snippet and decide whether to open. A top result that doesn't read as a precise, dated answer to the agent's query never gets fetched, and a claim only your own site asserts is likely to be killed in verification (~48% of checked claims were). Ranking is necessary, not sufficient.

**Do AI agents trust vendor benchmark numbers?**
Generally no, in what we observed. The agent in our instrumented runs extracted claims and tried to refute them before believing them; in one run a three-voter skeptic panel killed 48% of the claims it checked, and across runs it explicitly down-weighted vendor-authored pages. What survives is backed up by a primary or third-party source, such as a standards doc, regulator, or independent benchmark. So get your hard claims onto third-party surfaces rather than relying on your own marketing pages.

---

*Last verified 2026-06-12. We re-test the claims on this page quarterly. Changes are logged in the [Data Room](/agentic-discovery/data).*

**Part of [The Complete Playbook to Agentic Discovery](/agentic-discovery).**

