AEO

llms.txt: the plain-text site index AI agents read first

llms.txt is a plain-text file that tells AI systems what your website contains and where to find the most important content. Without it, AI agents have to guess. Most businesses have not added one yet.

Duncan Hotston

llms.txt is a plain-text file that sits in the root of your website and tells AI systems what your site contains. It is written in simple Markdown, readable by both machines and humans, and it gives AI agents a direct map of your most important content before they start working anything out for themselves. Most business websites do not have one. That is a straightforward problem with a straightforward fix.

The map analogy

Think about what happens when a delivery driver arrives in a new area without a postcode. They can find you eventually, but it takes longer, they might get it wrong, and they might give up and go to the address that was easier to locate. llms.txt is the postcode. It does not make your business more important. It makes your business findable by systems that would otherwise have to guess.

AI agents are not browsing your website the way a person does. They are reading it at speed, looking for signals that tell them what you do, who you serve, and whether you are worth citing in a response. Without clear signposting, they either skip you or get you wrong. Both outcomes cost you business.

What llms.txt actually contains

The file is short by design. It typically includes:

  • A brief description of what the business or website is
  • A list of key URLs the AI agent should prioritise
  • Optional: descriptions of what each linked page contains
  • Optional: a pointer to a more detailed companion file, llms-full.txt, for agents that want deeper context

Here is a simplified example of what the structure looks like:

# Acme Consulting

> B2B strategy consultancy specialising in supply chain and operations.

## Key pages

- [Services](https://example.com/services): Full list of consulting services offered
- [About](https://example.com/about): Background, credentials, and team
- [Case Studies](https://example.com/case-studies): Client work and outcomes
- [Contact](https://example.com/contact): How to get in touch

That is it. No code. No database. A plain-text document a competent copywriter could draft in an afternoon, assuming they know which pages matter and why.
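Because the structure is so regular, the file can also be generated programmatically. The sketch below is illustrative only, assuming the proposed llms.txt conventions (an H1 title, a blockquote summary, then a linked page list); the business name, page titles, and URLs are all placeholders:

```python
def build_llms_txt(name, summary, pages):
    """Render an llms.txt document: H1 title, blockquote summary,
    then an H2 section listing key pages as Markdown links."""
    lines = [f"# {name}", "", f"> {summary}", "", "## Key pages", ""]
    for title, url, note in pages:
        lines.append(f"- [{title}]({url}): {note}")
    return "\n".join(lines) + "\n"

# Hypothetical page list for the Acme Consulting example above.
pages = [
    ("Services", "https://example.com/services",
     "Full list of consulting services offered"),
    ("About", "https://example.com/about",
     "Background, credentials, and team"),
]
print(build_llms_txt(
    "Acme Consulting",
    "B2B strategy consultancy specialising in supply chain and operations.",
    pages))
```

The point of the sketch is the shape of the output, not the tooling: whether you write the file by hand or generate it from a content inventory, the result should be the same short, prioritised Markdown document.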

Why the timing matters

This is not a format that has been around for decades. llms.txt was proposed in late 2024 specifically in response to how large language models consume web content. It is new enough that adoption is low. Businesses that implement it now are doing so while most competitors have not thought about it yet.

That gap will close. It always does. The businesses that set up structured data for Google in 2012 had a meaningful advantage for several years before everyone caught up. The same pattern applies here, with the difference that AI search is growing faster than traditional SEO ever did.

What llms.txt does not do on its own

llms.txt is one layer, not the whole answer.

It tells AI agents where to look. It does not tell them what to think about you. That job belongs to the other layers: structured data that defines your business entity in a format AI systems understand natively, entity signals that establish your credibility and consistency across the web, and WebMCP that allows AI agents to interact with your business directly.

The beknown.world 5-Layer Framework treats llms.txt as Layer 3, positioned after crawlability and structured data because those two layers have to work before an AI agent will trust what llms.txt points to.

A building directory is only useful if the rooms it lists actually exist and are in order.

Common mistakes businesses make

Listing every page. llms.txt is not a sitemap. It is a priority signal. If you include forty URLs, the agent has no way of knowing which three actually matter. Pick the pages that define your business and leave the rest out.

Writing it for humans. The descriptions attached to each URL should use clear, categorically specific language. "Our services" is less useful than "consulting services for logistics and warehouse operations businesses". Specificity is what helps an AI agent match you to a relevant query.

Setting it up and forgetting it. If you add a new service, change your pricing model, or launch a case studies section, the file needs to reflect that. A stale llms.txt is worse than a vague one because it actively misdirects agents that trust it.

Treating it as a standalone fix. Businesses that add only llms.txt and nothing else typically see modest results. The file works best as part of a complete signal stack. On its own, it is a useful nudge. Combined with the other four layers, it is a meaningful advantage.

How to know if it is working

AI visibility is not tracked the same way web traffic is. You cannot open a dashboard and see "llms.txt referrals: 47 this week". What you can measure is whether AI tools are citing you accurately, whether the pages they reference match the ones you want them to reference, and whether AI-attributed traffic to your site is increasing over time.

The 16-Probe Scan checks whether your llms.txt file exists, whether it is correctly structured, and whether the pages it references are themselves set up to support AI citation. It is a starting point, not a verdict, but it tells you where you stand before you start making changes.

The plain summary

llms.txt is a small file that does a specific job. It tells AI agents what your site contains and which parts are worth their attention. It takes a short time to create and almost no time to maintain. The businesses that have it are already one layer ahead of those that do not. The ones that combine it with the remaining layers of the 5-Layer Framework are the ones that appear consistently when someone asks an AI tool to recommend a business like yours.

You can check whether your site has llms.txt and how it scores across all 16 signals at beknown.world/check. The scan is free and takes under a minute.


Frequently asked questions

What is llms.txt?

llms.txt is a plain-text file placed in the root of a website that tells AI language models what the site contains and which pages matter most. It is structured in simple Markdown and is read by AI agents before they crawl or summarise your content. Think of it as a table of contents written specifically for AI systems.

Is llms.txt the same as robots.txt?

No. robots.txt tells search engine crawlers which pages they are allowed to access. llms.txt tells AI agents what your site is about and which pages are worth reading. They serve different audiences and different purposes, though both live in the same root directory.
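The difference is easy to see side by side. A minimal robots.txt contains access rules rather than content descriptions; this hypothetical example simply allows all crawlers and points to a sitemap:

```text
# robots.txt: who may crawl, and what
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
```

llms.txt, by contrast, contains no access rules at all: it is a described list of the pages you most want AI agents to read.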

Does every business website need an llms.txt file?

Any business that wants to appear in AI-generated responses benefits from having one. Without it, AI agents have to infer your content structure from scratch. That inference is slower, less accurate, and more likely to result in your business being overlooked in favour of a competitor who has made things clearer.

How does llms.txt affect what AI says about my business?

AI tools like ChatGPT and Perplexity pull from sources they can read efficiently. A well-structured llms.txt file points those tools directly to your most important pages: your services, your credentials, your location, your pricing. The clearer the signposting, the more likely you are to be cited accurately.

How hard is it to add llms.txt to a website?

The file itself is simple plain text. Creating it takes less than an hour if you know your site well. The harder part is knowing what to include, how to structure the signals, and how to connect the file to the other layers of AI visibility. That is what beknown.world handles as part of the 5-Layer Framework.

Will llms.txt guarantee I appear in AI search results?

No single file guarantees anything. llms.txt is one layer of a broader signal stack. It works best when combined with structured data, entity signals, and WebMCP. Businesses that implement all five layers consistently are the ones that appear reliably in AI-generated recommendations.

How do I know if my site already has an llms.txt file?

Visit your domain with /llms.txt appended, for example yourbusiness.com/llms.txt. If you see a plain-text page with structured content, you have one. If you see a 404 error, you do not. The beknown.world 16-Probe Scan checks this automatically as part of a full visibility assessment.
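The same check can be scripted. This is a hedged sketch, not part of any beknown.world tooling; the domain is a placeholder, and some servers that reject HEAD requests would need a GET fallback:

```python
import urllib.request


def llms_txt_url(domain):
    """Build the conventional root-level location of llms.txt."""
    return f"https://{domain}/llms.txt"


def has_llms_txt(domain, timeout=10):
    """Return True if the site serves llms.txt (HTTP 200),
    False on 404, DNS failure, or any other network error."""
    req = urllib.request.Request(llms_txt_url(domain), method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False


# Example usage with a placeholder domain:
# has_llms_txt("yourbusiness.com")
```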

llms.txt · AI visibility · AEO · AI search · structured signals · entity signals

Check your AI visibility

Find out how AI search engines see your business. Free check, no commitment.

Get your free check