What is llms.txt?

llms.txt is a proposed web standard: a single Markdown file you place at your site's root (`yourdomain.com/llms.txt`) that hands large language models a short, curated map of your most important content. Think of it less like robots.txt, which controls what crawlers may touch, and more like a curated sitemap of only your best pages: a treasure map that says "to understand this brand, start here." It was proposed by Jeremy Howard (co-founder of Answer.AI) in September 2024, and it sits alongside GEO (generative engine optimization) and llms.txt-aware AI crawlers in the modern AI visibility toolkit.

TL;DR

  • **llms.txt is a Markdown file at your domain root** that points AI models to the handful of pages that best explain your brand.
  • It's for **inference time** — when an LLM is answering a question — not for training or for ranking in Google.
  • It does **not** block, control, or grant access to anything. That's robots.txt's job. llms.txt only *recommends*.
  • It's opt-in and advisory: no AI provider is obligated to read it, and major crawlers rarely fetch it today.
  • There's **no proven citation lift** from the file itself — treat it as cheap, low-risk infrastructure, not a growth lever.

llms.txt, defined

An LLM's context window is too small to swallow your entire website, and raw HTML — packed with navigation, ads and JavaScript — is messy to read. llms.txt solves both problems by letting *you*, the site owner, say "here is the good stuff, already curated." It's a plain Markdown file that lists your most important pages, each with a short description, so a model answering a question about your category has a clean starting point instead of guessing from scraped HTML.

It was proposed in September 2024 by Jeremy Howard, and the spec lives at llmstxt.org. Critically, it's aimed at **inference time** — the moment an AI engine like ChatGPT, Perplexity or Claude is composing an answer — not at training the model or at getting you indexed in Google.

A treasure map, not a fence

Search Engine Land put it well: llms.txt isn't robots.txt. robots.txt is a fence that blocks or permits crawlers. llms.txt is a treasure map that says "start digging here." It recommends; it never restricts.

What an llms.txt file looks like

The format is deliberately Markdown, not XML, so both humans and models can read it. The spec defines a strict order, but only the first element is required:

  1. **An H1 with your site or project name.** The only required element. Everything below it is optional but recommended.
  2. **A blockquote summary.** One short paragraph (prefixed with `>`) describing what your site or brand is.
  3. **Optional context sections.** Plain Markdown paragraphs or lists adding detail a model would find useful (any block type except headings).
  4. **H2 "file list" sections.** Bulleted links in the form `- [Page name](url): a one-line note`, which together make up your curated list of best pages.

One section name carries special meaning. An `## Optional` H2 marks links a model **can safely skip** when it needs a shorter context. It's the only section name the spec gives defined semantics.
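To make the structure concrete, here is a minimal sketch of how a consumer might read such a file and drop the `## Optional` section when context is tight. The function names, the regular expression and the sample file are illustrative assumptions, not part of the spec or of any particular tool.

```python
import re

# Matches "- [Page name](https://example.com/page): optional note"
LINK_RE = re.compile(
    r"^[-*]\s*\[(?P<name>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)

def parse_llms_txt(text):
    """Return (title, sections); sections maps each H2 name to its links."""
    title = None
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and title is None:
            title = line[2:].strip()  # the one required element
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(
                    (match["name"], match["url"], match["note"] or "")
                )
    return title, sections

def links_for_context(sections, short_context=False):
    """Flatten all links, skipping the 'Optional' section if context is short."""
    return [
        link
        for name, links in sections.items()
        if not (short_context and name == "Optional")
        for link in links
    ]

# Hypothetical sample file for a made-up company.
SAMPLE = """\
# Example Co

> Example Co makes hypothetical widgets.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): install and first run
- [Pricing](https://example.com/pricing.md): plans and limits

## Optional

- [Changelog](https://example.com/changelog.md): release history
"""

title, sections = parse_llms_txt(SAMPLE)
print(title)                                                # Example Co
print(len(links_for_context(sections)))                     # 3
print(len(links_for_context(sections, short_context=True)))  # 2
```

The parser deliberately ignores the blockquote and free-text context; a real consumer would likely keep them as background for the model.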

llms.txt vs llms-full.txt

A companion convention, llms-full.txt, concatenates your entire documentation into one large Markdown file — handy for pasting straight into a coding assistant. llms.txt is the curated index of links; llms-full.txt is the whole library. Many docs platforms also let you append .md to a page URL (e.g. /pricing → /pricing.md) to serve a clean Markdown version of that page.
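As a rough sketch of both conventions, the helpers below derive a `.md` URL from a page URL and join several Markdown pages into one llms-full.txt style document. Both function names are hypothetical, and the exact layout of llms-full.txt varies by platform, so treat the joining format as an assumption.

```python
from urllib.parse import urlsplit, urlunsplit

def markdown_url(page_url):
    """Append .md to the path of a page URL, keeping query and fragment."""
    parts = urlsplit(page_url)
    path = parts.path.rstrip("/")
    if not path or path.endswith(".md"):
        return page_url  # site root, or already a Markdown URL
    return urlunsplit(parts._replace(path=path + ".md"))

def build_llms_full(pages):
    """Join (title, markdown_body) pairs into one large Markdown document."""
    chunks = [f"# {title}\n\n{body.strip()}" for title, body in pages]
    return "\n\n".join(chunks) + "\n"

print(markdown_url("https://example.com/pricing"))
print(build_llms_full([("Quickstart", "Install it.\n"), ("Pricing", "Free tier.")]))
```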

How to create one

It's a plain text file — you can write it in any editor, or generate it. The hard part isn't the syntax; it's the curation. Resist the urge to dump every URL. Hand-pick the 5–10 pages that actually define your brand:

  • Your **homepage** and **about/company** page — who you are.
  • Your **pricing** page — how you're bought.
  • Your **core product or feature** pages — what you do.
  • Your **key docs or guides** — how you're used.
  • For each, write a **one-line description** so the model knows why it matters.

Save it as `llms.txt` and place it at your domain root so it resolves at `https://yourdomain.com/llms.txt`. That's it — curation beats completeness every time.
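If you would rather generate the file than type it, the steps above can be sketched in a few lines. This is one possible approach under stated assumptions: `render_llms_txt`, the `PAGES` data and the example.com URLs are all made up for illustration, and the curation (which pages, which descriptions) still has to come from you.

```python
def render_llms_txt(name, summary, sections, context=""):
    """Render llms.txt text: H1, blockquote, optional context, H2 link lists."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    if context:
        lines += [context, ""]
    for heading, links in sections.items():
        lines += [f"## {heading}", ""]
        lines += [f"- [{title}]({url}): {note}" for title, url, note in links]
        lines.append("")
    return "\n".join(lines).rstrip("\n") + "\n"

# Hand-picked pages for a hypothetical brand; each gets a one-line description.
PAGES = {
    "Company": [
        ("Home", "https://example.com/", "what Example Co does"),
        ("About", "https://example.com/about", "team and history"),
    ],
    "Product": [
        ("Pricing", "https://example.com/pricing", "plans and limits"),
        ("Docs", "https://example.com/docs", "setup and usage guides"),
    ],
}

llms_txt = render_llms_txt(
    "Example Co", "Example Co makes hypothetical widgets.", PAGES
)
print(llms_txt)
```

Writing the result to a file named `llms.txt` in your web root is then a single file write in whatever deploy step you already use.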

llms.txt vs robots.txt vs sitemap.xml

These three files are easy to confuse, but they do completely different jobs. They coexist — llms.txt replaces neither of the others.

| File | Its job | In one word |
| --- | --- | --- |
| robots.txt | Tells crawlers which paths they may or may not access | Exclusion |
| sitemap.xml | Lists every indexable URL so search engines discover them all | Discovery |
| llms.txt | Points AI models to your best content for answering questions | Curation |

robots.txt and sitemap.xml are about *access* and *indexing for search*. llms.txt is about *understanding and curation for AI answers*. robots.txt tells crawlers what they may and may not fetch; llms.txt only recommends, and it carries no access rules at all. Where compliant crawlers honor robots.txt by long-standing convention, an AI provider can ignore llms.txt entirely. If you actually want to control which bots reach your site, that's a job for robots.txt and AI-crawler rules, not llms.txt.
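To see the difference in practice, the sketch below uses Python's standard-library `urllib.robotparser` to evaluate robots.txt rules the way a compliant crawler would. The rules and URLs are made-up examples; the point is that robots.txt yields a yes/no answer per path, which llms.txt has no syntax to express.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep GPTBot out of /private/, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

blocked = parser.can_fetch("GPTBot", "https://example.com/private/report")
allowed = parser.can_fetch("GPTBot", "https://example.com/pricing")

print(blocked)  # False: the Disallow rule applies to this path
print(allowed)  # True: no rule excludes this path
```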

Does llms.txt actually work? A reality check

This is where honesty matters more than hype. As of early 2026, the evidence for llms.txt moving the needle is thin — and you should know that before you over-invest.

  • **Adoption is low and AI bots rarely fetch it.** In an SE Ranking study of roughly 300,000 domains, only about 10% had an llms.txt file. Across 62,000+ AI-bot visits, the file was targeted in only about 0.1% of them — major bots like GPTBot, ClaudeBot and PerplexityBot showed essentially no requests for it.
  • **No measured citation lift.** A 10-site before/after study found no independent effect of llms.txt on whether LLMs cited those sites; the gains that did appear traced back to content, PR and technical fixes, not the file.
  • **Google doesn't use it.** Google's John Mueller has said no AI service has confirmed using llms.txt — and that you can tell from server logs they don't even check for it. Google's AI Overviews and AI Mode draw from the regular Search index, so the file does nothing for Google.
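You can run the same server-log check on your own site. The sketch below counts how often known AI bots request `/llms.txt` compared with their total visits. It assumes access logs in the common "combined" format; the sample lines, IP addresses and simplified user-agent strings are invented for illustration, so adjust the pattern to your server's actual log format.

```python
import re
from collections import Counter

AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Combined log format: "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def count_llms_txt_hits(log_lines):
    """Return (visits, hits): per-bot request totals and /llms.txt requests."""
    visits, hits = Counter(), Counter()
    for line in log_lines:
        match = LOG_RE.search(line)
        if not match:
            continue
        agent = match["agent"].lower()
        bot = next((b for b in AI_BOTS if b.lower() in agent), None)
        if bot is None:
            continue  # not one of the tracked AI bots
        visits[bot] += 1
        if match["path"].split("?", 1)[0] == "/llms.txt":
            hits[bot] += 1
    return visits, hits

# Made-up sample log lines, not real traffic.
SAMPLE_LOG = """\
203.0.113.5 - - [01/Jan/2026:00:00:01 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"
203.0.113.5 - - [01/Jan/2026:00:00:02 +0000] "GET /docs HTTP/1.1" 200 9000 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"
203.0.113.6 - - [01/Jan/2026:00:00:03 +0000] "GET /llms.txt HTTP/1.1" 200 812 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"
203.0.113.7 - - [01/Jan/2026:00:00:04 +0000] "GET /about HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/120"
"""

visits, hits = count_llms_txt_hits(SAMPLE_LOG.splitlines())
for bot in visits:
    print(bot, visits[bot], hits[bot])
```

User-agent strings can be spoofed, so counts like these are a rough signal rather than proof of which provider fetched the file.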

So why publish one?

Because it's cheap, low-risk infrastructure, and the strongest use case — documentation sites feeding coding assistants — genuinely works. Real adopters include Anthropic, Hugging Face, Perplexity, Zapier, Cursor and Windsurf. Publish it, keep your expectations grounded, and don't treat it as a ranking or citation lever.

How to tell if it's helping

Since the file itself has no proven citation effect, the only way to know whether *anything* you do is working is to measure the outcome that matters: are AI engines actually citing you? That means tracking your **mention rate** and **share of voice** across ChatGPT, Perplexity, Gemini and Claude, and watching the first-party AI-crawler and referral traffic landing on your site. SourceWatch measures exactly this — so you can see whether your AI visibility moves after you publish an llms.txt, instead of assuming it did. If you'd rather skip the manual file, our llms.txt generator builds a clean, curated one from your site in seconds.
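As a back-of-the-envelope illustration of those two metrics, the sketch below computes them from a handful of sampled AI answers. The definitions used here are assumptions for the example (mention rate as the share of answers naming your brand, share of voice as your mentions divided by all tracked-brand mentions) and are not necessarily how SourceWatch or any other tool calculates them. The brands and answers are invented.

```python
def mention_rate(answers, brand):
    """Fraction of sampled answers that mention the brand (case-insensitive)."""
    if not answers:
        return 0.0
    return sum(brand.lower() in a.lower() for a in answers) / len(answers)

def share_of_voice(answers, brand, competitors):
    """Brand's mentions as a share of mentions across all tracked brands."""
    counts = {
        b: sum(b.lower() in a.lower() for a in answers)
        for b in [brand, *competitors]
    }
    total = sum(counts.values())
    return counts[brand] / total if total else 0.0

# Invented answers for a hypothetical category query.
ANSWERS = [
    "Example Co and Rival Inc both offer widgets.",
    "Rival Inc is a popular choice.",
    "Try Example Co for small teams.",
    "There are many widget vendors.",
]

print(mention_rate(ANSWERS, "Example Co"))                   # 0.5
print(share_of_voice(ANSWERS, "Example Co", ["Rival Inc"]))  # 0.5
```

Comparing these numbers before and after publishing the file, on the same set of prompts, gives a fairer picture than a single snapshot.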

Frequently asked questions

What is an llms.txt file?

It's a proposed web standard — a single Markdown file at your domain root that gives large language models a curated, plain-text map of your most important pages. It's meant to help AI engines understand and reference your content at inference time, not to control crawler access or affect Google rankings.

Source: The /llms.txt specification

Is llms.txt the same as robots.txt?

No. robots.txt controls which paths crawlers may access — it blocks or permits. llms.txt does neither; it only recommends your best content to AI models, has no enforcement, and is purely voluntary for AI providers to read. They do different jobs and coexist.

Source: "llms.txt isn't robots.txt", Search Engine Land

Does Google use llms.txt?

No. Google's John Mueller has said none of the AI services have confirmed using llms.txt, and that you can tell from server logs they don't even check for it. Google's AI Overviews and AI Mode pull from the regular Search index, so the file has no effect on Google.

Source: "Does llms.txt matter?", Search Engine Land

Does publishing llms.txt improve my AI rankings or citations?

There's no evidence it does. A before/after study across 10 sites found no independent effect of the file on LLM citations, and a larger study of ~300,000 domains found AI bots fetch it in only about 0.1% of visits. Treat it as low-cost, opt-in infrastructure — its strongest, proven use case is documentation sites feeding coding assistants — not as a citation lever.

Source: "llms.txt shows no clear effect on AI citations (300k domains)", Search Engine Journal

What's the difference between llms.txt and llms-full.txt?

llms.txt is a curated index of links to your most important pages. llms-full.txt concatenates your entire documentation or content into one large Markdown file, useful for pasting directly into a coding assistant. One is a map; the other is the whole library.

Source: "Simplifying docs with /llms.txt", Mintlify

How do I create an llms.txt file?

Write a plain Markdown file with an H1 of your site name, a one-line blockquote summary, and a short bulleted list of your 5–10 most important pages — each with a one-line description. Save it as llms.txt and place it at your domain root so it loads at yourdomain.com/llms.txt. Curation beats completeness: pick your best pages, don't dump every URL.

Source: The /llms.txt specification

Where should I place the llms.txt file?

At your domain root, exactly like robots.txt, so it resolves at yourdomain.com/llms.txt. AI tools that support the convention look for it there. A file buried in a subfolder won't be discovered.

Is llms.txt worth doing in 2026?

It's worth a small, one-time effort — it's cheap, low-risk, and the documentation-for-coding-assistants use case genuinely works. Just keep expectations realistic: adoption is low, major AI bots rarely fetch it, and there's no proven citation lift. Publish it as infrastructure, then measure your actual AI visibility — mention rate and share of voice across ChatGPT, Perplexity, Gemini and Claude — to see what truly moves the needle.
