Before you start: the three engines have different plumbing
This is the single most important concept on the page, so read it before touching anything. The three big answer engines don't share an index or a crawler; each draws from a different place. Optimize for the wrong plumbing and you stay invisible on one platform even while you win on another.
| Engine | Where its answers come from | What that means for you |
|---|---|---|
| ChatGPT Search | Runs on Bing's index (one analysis matched ~87% of citations to Bing's top organic results) plus its own OAI-SearchBot crawler | Strong on Google but weak on Bing = invisible in ChatGPT. Verify in Bing Webmaster Tools. |
| Google AI Overviews | Google's existing index, via retrieval + query fan-out over already-indexed, already-trusted pages | If you rank and are snippet-eligible on Google, you're in the running. No special files needed. |
| Perplexity | Leans heavily on Reddit and original data/statistics | Earned presence and proprietary numbers matter most here. |
The takeaway in one line
ChatGPT visibility runs through Bing. Google AI Overviews run through your existing Google index. Perplexity rewards original data and Reddit presence. One blended "AI strategy" misses this, so optimize per engine. For the full strategy behind these moves, see generative engine optimization.
Why does this matter so much? Because AI answers cite only **2–7 domains** — not ten blue links. The game shifts from "rank in the top 10" to "be one of a handful of cited sources." A narrow funnel means being pretty good rarely makes the cut, and being invisible on one engine's plumbing costs you the whole platform.
Phase 1 — Open crawl access (do this first)
Everything else is wasted effort if the bots can't reach you. This is the gate people quietly fail most often — usually a security plugin, a WAF rule, or a copy-pasted robots.txt is blocking the very crawlers they want to attract. Work these four steps before any content work.
Step 1 — Audit robots.txt and allow the search bots
Crawler access is now granular: you can opt in or out per purpose, allowing the **search** bots while still blocking the **training** bots if you choose. The search bots decide whether you show up in answers, so allow those explicitly. Here's who does what:
| Operator | Search bot (allow this) | Training bot (optional) | Live-fetch bot |
|---|---|---|---|
| OpenAI | OAI-SearchBot — ChatGPT search visibility | GPTBot — model training | ChatGPT-User — live user fetches |
| Anthropic | Claude-SearchBot — Claude search | ClaudeBot — model training | Claude-User — live fetches |
| Perplexity | PerplexityBot — search indexing | — | Perplexity-User — live answers |
OpenAI's own words
Opting out of OAI-SearchBot means your site "will not be shown in ChatGPT search answers." That's a direct quote. If that bot is blocked, no amount of great content gets you into ChatGPT's answers. Anthropic's and Perplexity's bots all respect robots.txt (Anthropic also honors Crawl-delay); Perplexity publishes IP ranges as JSON so you can allowlist its bots at the WAF too.
See our AI crawlers reference for exact, copy-pasteable robots.txt directives for every bot above. The verifiable result: open your live robots.txt in a browser and confirm there is no `Disallow: /` under those search-bot user-agents, or under a catch-all `User-agent: *` group that those bots would fall back to.
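If you'd rather script that check than eyeball it, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content is a made-up sample and the helper name `blocked_search_bots` is ours; the bot names are the search bots from the table above. It only evaluates robots.txt rules, so a WAF or security-plugin block would not show up here.

```python
from urllib.robotparser import RobotFileParser

# Search-bot user-agents named in the table above.
SEARCH_BOTS = ["OAI-SearchBot", "Claude-SearchBot", "PerplexityBot"]


def blocked_search_bots(robots_txt: str, path: str = "/") -> list[str]:
    """Return the search bots that this robots.txt keeps away from `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in SEARCH_BOTS if not parser.can_fetch(bot, path)]


# Hypothetical robots.txt: the training bot is blocked on purpose,
# but a copy-pasted rule also blocks one of the search bots.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

print(blocked_search_bots(SAMPLE_ROBOTS_TXT))  # ['PerplexityBot']
```

To run it against a real site, paste the text of your live robots.txt in place of the sample. An empty list means none of the three search bots is disallowed for that path.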
Step 2 — Verify server-side rendering
OpenAI's and Anthropic's crawlers **cannot render JavaScript**: they only see the initial HTML the server returns. If your content is client-side rendered (a React/Vue app that fills in the page after load), the bots see an empty shell and you're invisible to them. How to check it yourself: open your page, view source (the raw HTML, not the inspected DOM), and confirm your actual headings, answer text and links are present in that source. If view-source shows little more than an empty `<div id="root">`, your content isn't reaching the bots.
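The view-source check can be approximated in code. The sketch below assumes you already have the raw server response as a string (for example, saved from view-source); the helper name `missing_from_raw_html` and both sample pages are hypothetical. It strips tags, ignores anything inside `<script>` and `<style>`, and reports which of your key phrases are absent from the text a non-rendering crawler would see.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collects text outside <script>/<style>, i.e. text present without running JS."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def missing_from_raw_html(raw_html: str, expected_phrases: list[str]) -> list[str]:
    """Return the phrases that do NOT appear in the server-returned HTML text."""
    parser = _VisibleText()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return [p for p in expected_phrases if p.lower() not in text]


# Hypothetical pages: a client-side-rendered shell and a server-rendered page.
CSR_SHELL = '<html><body><div id="root"></div><script>render("How do I rank in ChatGPT?")</script></body></html>'
SSR_PAGE = "<html><body><h2>How do I rank in ChatGPT?</h2><p>Verify your site in Bing Webmaster Tools.</p></body></html>"

print(missing_from_raw_html(CSR_SHELL, ["How do I rank in ChatGPT?"]))
print(missing_from_raw_html(SSR_PAGE, ["How do I rank in ChatGPT?"]))
```

The first call reports the heading as missing because it only exists inside a script; the second returns an empty list.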
Step 3 — Submit to Bing Webmaster Tools
This is the ChatGPT on-ramp most people skip. Because ChatGPT Search runs on Bing's index, a site that ranks beautifully on Google but is absent from Bing can be invisible to ChatGPT regardless of its Google position. Verify your site in Bing Webmaster Tools (not just Google Search Console) and submit your sitemap there. The verifiable result: your pages show as indexed in the Bing Webmaster Tools dashboard.
Step 4 — Confirm Google indexing for AI Overviews
Google AI Overviews pull from Google's existing index, so the requirement is the ordinary one: your page must be **indexed and eligible to show with a snippet**. Check it in Google Search Console's URL Inspection tool. Nothing exotic — if it can rank with a snippet, it can be used in an AI Overview.
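One thing you can pre-check before opening Search Console is whether the page itself opts out of indexing or snippets. This is a rough sketch, not a substitute for URL Inspection: it only reads `<meta name="robots">` and `<meta name="googlebot">` tags from HTML you supply, it does not see `X-Robots-Tag` response headers, and it cannot tell you whether Google has actually indexed the page. The helper name `snippet_blockers` is ours.

```python
from html.parser import HTMLParser

# Robots meta directives that remove a page from the index or suppress its snippet.
BLOCKING_DIRECTIVES = {"noindex", "none", "nosnippet", "max-snippet:0"}


class _RobotsMeta(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def snippet_blockers(raw_html: str) -> list[str]:
    """Return any meta directives that would block indexing or snippets."""
    parser = _RobotsMeta()
    parser.feed(raw_html)
    parser.close()
    return sorted(parser.directives & BLOCKING_DIRECTIVES)


print(snippet_blockers('<head><meta name="robots" content="index, nosnippet"></head>'))
# ['nosnippet']
```

An empty list means the page's meta tags are not the obstacle; confirm actual index status in URL Inspection.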
Not sure which bots your site is blocking, or whether AI engines can actually read your HTML? Run a free AI SEO audit — it checks crawl access, indexability and renderability in about 15 seconds.
Phase 2 — Structure pages for extraction
Now that the bots can read you, make your content trivially easy to lift as a clean, self-contained answer. A trusted, crawlable page still won't get quoted if the model has to guess at your meaning. These four steps decide extractability — the same principles behind answer engine optimization.
Step 5 — Front-load a 40–60 word answer
On every priority page, open with a direct 40–60 word answer to the page's primary question — one a model could lift verbatim and be correct. The first ~200 words should fully answer the query, not build up to it. You're handing the engine a ready-made, accurate quote so it doesn't have to assemble one (and risk paraphrasing you wrong). The intro at the top of this very page is built that way on purpose.
What a good answer block looks like
Question-shaped heading, then 40–60 words that answer it completely with no "read on to find out." Concrete, specific, self-contained. Below it, expand with detail, examples, statistics and sources. Front-loading the answer is the cheapest, highest-leverage structural move you can make.
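The 40–60 word target is easy to lint. A minimal sketch, assuming you pass in the opening answer paragraph as plain text; the function names and the draft text are illustrative, and counting whitespace-separated tokens is only one reasonable definition of a "word".

```python
def answer_word_count(answer: str) -> int:
    """Count whitespace-separated words in the opening answer block."""
    return len(answer.split())


def is_front_loaded(answer: str, low: int = 40, high: int = 60) -> bool:
    """True when the answer block falls inside the 40-60 word target."""
    return low <= answer_word_count(answer) <= high


# Hypothetical draft of an opening answer.
draft = (
    "To show up in ChatGPT Search, make sure OAI-SearchBot is allowed in "
    "robots.txt, serve your content in the initial HTML, and verify the site "
    "in Bing Webmaster Tools."
)
print(answer_word_count(draft), is_front_loaded(draft))
```

A draft that comes in short, like this one, needs more substance rather than padding: add the specifics a reader would otherwise have to scroll for.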
Step 6 — Use a clean heading hierarchy: one question per heading
Use real semantic HTML — a clean H2/H3 outline where each heading poses one question the section then answers. This lets the model map a query to a specific block instead of scanning prose. The verifiable result: your headings read like the questions a buyer would actually type.
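A quick way to review a page's outline is to pull the headings out of the HTML and flag the ones that break the pattern. The sketch below uses two rough heuristics of our own: a heading level that jumps by more than one, and an H2/H3 that doesn't end in a question mark. Treat its output as prompts for a human review, not as rules any engine is known to enforce.

```python
from html.parser import HTMLParser

HEADING_TAGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class _Headings(HTMLParser):
    """Collects (level, text) for every heading in document order."""

    def __init__(self):
        super().__init__()
        self._level = None
        self._buf = []
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._buf = []

    def handle_data(self, data):
        if self._level is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            text = " ".join("".join(self._buf).split())
            self.headings.append((self._level, text))
            self._level = None


def audit_headings(raw_html: str) -> list[str]:
    """Flag skipped heading levels and H2/H3 headings that aren't questions."""
    parser = _Headings()
    parser.feed(raw_html)
    parser.close()
    issues, previous = [], 0
    for level, text in parser.headings:
        if previous and level > previous + 1:
            issues.append(f"skipped level before h{level}: {text!r}")
        if level in (2, 3) and not text.endswith("?"):
            issues.append(f"not question-shaped: {text!r}")
        previous = level
    return issues


# Hypothetical page outline.
SAMPLE = (
    "<h1>AI search checklist</h1>"
    "<h2>How do I get cited by ChatGPT?</h2>"
    "<h4>Bing setup</h4>"
    "<h2>Our services</h2>"
)
for issue in audit_headings(SAMPLE):
    print(issue)
```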
Step 7 — Add FAQ sections (clear Q&A pairs)
AI engines rely heavily on clear question-and-answer pairs — they map almost one-to-one onto how people prompt. Add an FAQ section to priority pages with real questions and tight, self-contained answers. (This page ends with one; reuse the pattern.)
Step 8 — Add schema markup
Add structured data for rich-result eligibility and machine clarity: **Article, FAQPage, Organization, HowTo, and BreadcrumbList** (the schema.org names for the article, FAQ, organization, how-to and breadcrumb types) are the high-value ones. Treat schema as supporting, not mandatory: Google is explicit that schema is not required for AI features, but it helps classic search and removes ambiguity about what your page is.
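If your FAQ already lives as question-and-answer pairs, the matching markup can be generated rather than hand-written. This sketch builds schema.org `FAQPage` JSON-LD with the standard `json` module; the helper name `faq_jsonld` and the sample pair are illustrative. The output is meant to be placed inside a `<script type="application/ld+json">` element on the page that visibly shows the same questions and answers.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Render (question, answer) pairs as schema.org FAQPage JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2)


# Hypothetical FAQ content.
print(faq_jsonld([
    ("Does ChatGPT Search use Bing?",
     "ChatGPT Search draws on Bing's index plus its own OAI-SearchBot crawler."),
]))
```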
Phase 4 — Measure (on a schedule, not once)
You can't improve what you can't see, and AI answers are non-deterministic — ask the same question twice and the wording shifts. So measurement is a repeatable loop, and the metric is not "rank." Track these three things.
1. **Which prompts cite you.** Take your priority prompts, run each 3–5 times across ChatGPT and Perplexity, and log whether you appeared, your position in the answer, which competitors showed up, and which sources got cited. Repeat weekly; the trend, not any single run, is the signal. (This is the core of AI citation tracking.)
2. **AI referral traffic.** Segment AI referrals in GA4 (filter sessions from ChatGPT, Perplexity, Gemini and Claude referrers) so you can see the real clicks an AI answer sends, and watch them grow.
3. **Bing presence.** Keep monitoring your Bing index status, since it underpins ChatGPT visibility. A drop in Bing coverage is an early warning for ChatGPT.
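Because any single run is noisy, it helps to log every run and look at a rate per prompt and engine. Here is one way the log for the first metric could be summarized. The `PromptRun` record and its fields are our own invention, and the runs are made-up data; a real log would also carry position, competitors and cited sources.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PromptRun:
    prompt: str
    engine: str
    cited: bool  # did our domain appear among the cited sources?


def citation_rates(runs: list[PromptRun]) -> dict[tuple[str, str], float]:
    """Share of runs in which we were cited, per (prompt, engine)."""
    totals, hits = defaultdict(int), defaultdict(int)
    for run in runs:
        key = (run.prompt, run.engine)
        totals[key] += 1
        hits[key] += int(run.cited)
    return {key: hits[key] / totals[key] for key in totals}


# Hypothetical week of runs for one priority prompt.
runs = [
    PromptRun("best crm for startups", "ChatGPT", True),
    PromptRun("best crm for startups", "ChatGPT", False),
    PromptRun("best crm for startups", "ChatGPT", True),
    PromptRun("best crm for startups", "Perplexity", False),
    PromptRun("best crm for startups", "Perplexity", False),
    PromptRun("best crm for startups", "Perplexity", True),
    PromptRun("best crm for startups", "Perplexity", False),
]
for (prompt, engine), rate in citation_rates(runs).items():
    print(f"{engine}: cited in {rate:.0%} of runs for {prompt!r}")
```

Storing one such summary per week gives you the trend line that the weekly repeat is meant to produce.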
There's also a higher-confidence signal most setups ignore: your own server logs. When an AI engine reads or cites your site, its crawler hits your pages and its answers send real referral clicks — that first-party traffic is ground truth, not a synthetic sample. The catch is that crawler user-agents are easy to spoof, so it has to be verified. SourceWatch measures both sides: mentions and share of voice across ChatGPT, Perplexity, Gemini and Claude, *and* the real (verified vs. spoofed) AI-crawler and AI-referral traffic landing on your site. There's also an MCP server, so you can pull your AI visibility straight into Claude Code.
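To make the "verified vs. spoofed" distinction concrete, here is a generic sketch of one common verification approach: accept a log line as a real crawler hit only if the user-agent names the bot and the source IP falls inside ranges the operator publishes. This is not SourceWatch's method, which the page doesn't describe. The CIDR blocks below are reserved documentation ranges standing in for a published list, and the function name is ours.

```python
import ipaddress


def is_verified_crawler(ip: str, user_agent: str, bot_token: str,
                        published_cidrs: list[str]) -> bool:
    """True only if the UA names the bot AND the IP is in a published range."""
    if bot_token.lower() not in user_agent.lower():
        return False
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in published_cidrs)


# Placeholder ranges (reserved for documentation), NOT any operator's real list.
HYPOTHETICAL_RANGES = ["192.0.2.0/24", "2001:db8::/32"]
UA = "Mozilla/5.0 (compatible; PerplexityBot/1.0)"

print(is_verified_crawler("192.0.2.10", UA, "PerplexityBot", HYPOTHETICAL_RANGES))   # True
print(is_verified_crawler("203.0.113.5", UA, "PerplexityBot", HYPOTHETICAL_RANGES))  # False
```

The second request claims to be the bot but arrives from outside the listed ranges, so it is counted as spoofed.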
Why bother? Because AI traffic punches far above its weight
It's low volume but high intent. Semrush (June 2025) found AI search traffic converts ~4.4x higher than organic for consideration queries. Search Engine Land's 13-month dataset found LLM referrals the highest-converting source at ~18%, despite being ~25x smaller than SEO/direct. Ahrefs put AI visitors at 0.5% of traffic but 12.1% of signups (~23x edge), and Microsoft Clarity (1,200+ sites) saw LLM visitors convert to signups at 1.66% vs 0.15% from search. With Gartner projecting traditional search volume to drop ~25%, being cited early compounds.
The llms.txt question (and a real disagreement to know about)
You'll see a lot of guides tell you to create an llms.txt file as a core AI-search move. Here's the honest, differentiated take — because the experts genuinely disagree, and you should know where things actually stand before spending time on it.
- **What it is:** llms.txt is a proposed standard from Jeremy Howard (September 2024) — a Markdown file that points LLMs to your key content. It's supported by some tools and adopted by various developer-docs sites.
- **Google says don't bother:** Google explicitly calls out "creating llms.txt files," content chunking, and AI-specific rewrites as things that do **not** help its AI features. That's a direct stance from the search engine behind AI Overviews.
- **The other engines haven't confirmed it:** OpenAI, Anthropic and Perplexity have not confirmed llms.txt as a ranking input. It's unverified, not proven harmful or helpful.
- **The honest position:** treat llms.txt as **low-cost and optional**, not a core lever. If it's cheap to add and you want the docs-discovery convenience, fine — but don't mistake it for a visibility hack, and don't prioritize it over phases 1–4.
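If you do decide to add one, the file is small enough to generate. The sketch below assumes the general shape described in the llms.txt proposal (a title, a short blockquote summary, then sections of Markdown links); check the current proposal before relying on it. The site name, URLs and helper name are placeholders.

```python
def render_llms_txt(site_name: str, summary: str,
                    sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render a Markdown llms.txt: title, blockquote summary, linked sections."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content for illustration only.
print(render_llms_txt(
    "Example Co",
    "Guides on getting cited by AI answer engines.",
    {"Guides": [("Checklist", "https://example.com/checklist.md",
                 "Step-by-step AI search checklist")]},
))
```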
The reassuring part
Notice that almost nothing in this checklist is an "AI hack." Open the gates, answer the question clearly, cite real sources, earn real authority, and measure. That's good content discipline aimed at a new surface — which is exactly why it holds up even as the engines change.
Common mistakes that keep you invisible
AI search is new enough that a lot of confident advice is wrong. The ones worth catching before they cost you:
- **Blocking AI crawlers by accident** — in robots.txt or at the WAF. This is the #1 cause of invisibility. Re-read phase 1.
- **Ignoring Bing** — it kills ChatGPT Search visibility even with great Google rankings. Verify in Bing Webmaster Tools.
- **Client-side-only rendering** — JavaScript content the bots literally cannot see. Confirm your answer text is in view-source HTML.
- **Burying the answer** — building up to it instead of answering in the first 40–60 words. Front-load every priority page.
- **Treating AI search as a one-time tweak** — it's an ongoing discipline. Answers drift, competitors move, freshness decays. Measure on a schedule.
- **Over-relying on llms.txt** — see the section above. Optional, not core, and explicitly dismissed by Google.
Where to go next
You've got the checklist. For the why behind it and the deeper per-engine playbooks:
- AI SEO: The Complete Guide — the hub: what GEO is, why it matters, and the strategy behind this checklist.
- How to Show Up in AI Search — the three-gates framework in depth.
- How to Rank in ChatGPT — the ChatGPT-and-Bing-specific playbook.
- How to Rank in Perplexity — the original-data and Reddit angle.
- How to Show Up in Google AI Search (AI Overviews) — the Google eligibility rules.
- AI crawlers reference — exact robots.txt directives for every bot.
Stop guessing whether AI engines can see and cite you. Run a free AI SEO audit, then track your visibility as you work this checklist.