Abhishek Chaudhary

Does llms.txt Actually Work in 2026? An Honest Case

Public 2026 data says llms.txt does not move AI citation numbers. I ship one on this site anyway, per-request and dynamic. Here is the honest case.

Abhishek Chaudhary · 11 min read

llms.txt is a plain-text file at the root of a website that gives large language model crawlers a compact, machine-readable summary of the site, following the spec at llmstxt.org. On paper, it is the AI-era answer to robots.txt for discovery instead of exclusion. In the data we have from 2026, it does not actually move citation numbers by itself. I still ship one on this site as a dynamic Next.js route handler. This post is the honest case for and against, and what I would change if the data shifts.

TL;DR

  • Public 2026 analyses from ALM Corp and SearchSignal both conclude llms.txt presence does not improve citation-likelihood predictors.
  • The file is still cheap to ship, and it doubles as a single canonical bio-and-discography surface that LLMs can ingest in one fetch instead of scraping ten pages.
  • I run it as a Next.js 16 dynamic route handler so new blog posts appear automatically.
  • What actually moves citations: the encyclopedic opener, FAQPage schema, named human byline, dated first-person specifics, and outbound links to authorities.
  • llms.txt is not the lever. Ship it anyway.

What the public data on llms.txt actually says in 2026

Two independent analyses in 2026 converged on the same finding: adding llms.txt to a site does not change its probability of being cited by ChatGPT, Perplexity, Claude, or Google AI Overviews. ALM Corp's data analysis ran the file through predictive citation models on a broad domain set and found it did not improve retrieval or weighting. SearchSignal's 2026 breakdown reached the same conclusion from a different angle, pointing out that the major LLM crawlers do not currently fetch llms.txt as part of their canonical crawl path.

The reason is structural. ChatGPT Search, Perplexity, Claude, and Google AI Overviews all use some combination of (a) a traditional search index, (b) live web fetch of the top-ranked URLs, and (c) retrieval from their own cached copies of the open web. None of the four consumes llms.txt as an authoritative source. The spec exists, the file gets downloaded by curious humans and a few experimental bots, and then nothing downstream happens.

This is not a secret. The llmstxt.org spec itself is proposed, not adopted. Treating it as an SEO lever because the marketing-tool space started selling "AI SEO" packages around it is exactly the pattern that produced the 2020 robots.txt-as-ranking-factor myth.

If your goal is to get cited more in AI search in 2026, llms.txt is not the lever. That is the starting position for every honest conversation about shipping one.

Why I ship one anyway: three reasons that survive the data

I still ship llms.txt on this site. Three reasons, in priority order.

First, it is cheap. The route handler at app/llms.txt/route.ts is a single file, returns a plain-text body with a one-hour cache header, and has no runtime dependencies beyond the blog-query function it already uses for the sitemap. I spent a short afternoon writing it and have not touched it since except to append new sections as the discography grew. The cost-benefit math is asymmetric: if one of the five major LLM surfaces starts respecting llms.txt in 2027, the site is already compliant, and if none of them ever does, I lose an afternoon.
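Under those constraints, the whole handler fits in a few lines. This is a minimal sketch, not the site's actual code: buildLlmsTxt is a hypothetical stand-in for the real body builder, which would assemble the bio, Key Facts, Pages, discography, and blog sections.

```typescript
// Sketch of a handler shaped like app/llms.txt/route.ts.
// `buildLlmsTxt` is a hypothetical stand-in for the real body builder.

export const dynamic = "force-dynamic"; // opt out of static prerendering

function buildLlmsTxt(): string {
  // The real handler assembles bio, Key Facts, Pages, discography,
  // and blog sections; here it is a stub.
  return "# Artist Name\n\n> Brief description\n";
}

export async function GET(): Promise<Response> {
  return new Response(buildLlmsTxt(), {
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
      // One hour of CDN caching keeps the DB hit roughly constant
      // regardless of crawler volume.
      "Cache-Control": "public, max-age=3600",
    },
  });
}
```

The `Response` here is the standard web `Response` that Next.js route handlers return; no framework import is needed.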

Second, it doubles as the canonical one-file bio surface. The file opens with a 170-word summary of who the artist is, followed by a "Key Facts" block, a "Pages" directory, an "NCS" section explaining the CC BY 4.0 licence, a discography, press quotes, social profiles, and a dynamic "Blog Posts" section that auto-populates from the blog table. Any LLM that does ingest the file gets the whole identity graph in a single fetch. Contrast this with a model that has to scrape /about, /press, /ncs, and every track page individually to assemble the same picture. The compact summary beats the scrape every time, and even crawlers that do not officially respect the file may follow its links when they encounter it.

Third, it is a machine-readable enforcement of the brand voice. The file is where the canonical "polymath sentence" lives: the same one that appears in /about, in the site's Person JSON-LD description, and in the homepage subtitle. When an LLM trains on a scrape of the site, any version of that sentence it encounters is the version I wrote. Drift across surfaces is the single biggest brand-voice problem for a personal site, and the file is one of the handful of places where the canonical phrasing lives explicitly.

The dynamic route vs a static file: why I run it per-request

Most llms.txt examples on the open web are static files. I run mine as a Next.js 16 route handler at app/llms.txt/route.ts with export const dynamic = "force-dynamic". The reason is that a static file goes stale the moment I publish a new blog post or add a new track, and I am not going to remember to regenerate a static file every time that happens.

The dynamic handler reads getPublishedBlogs() at request time and appends a ## Blog Posts section with the title, URL, and excerpt of every published post. Today that list shows What Is NCS Music? The Full Guide for Indian Creators 2026, the musician-turned-founder cross-domain piece, and the SQLite vs Postgres stack post. When this post ships, it will appear in llms.txt on the next request, with no code change, no manual edit, and no cron job.
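A sketch of how that append step could work. The getPublishedBlogs() name comes from the post; the row shape (title, slug, excerpt) is my assumption, and the stub below stands in for the real DB query:

```typescript
// Sketch of the dynamic "## Blog Posts" section.
// Field names on BlogRow are assumptions; the real query reads the blog table.

type BlogRow = { title: string; slug: string; excerpt: string };

// Stand-in for the real DB query used by the handler and the sitemap.
async function getPublishedBlogs(): Promise<BlogRow[]> {
  return [
    { title: "Example Post", slug: "example-post", excerpt: "One-line summary." },
  ];
}

async function blogPostsSection(baseUrl: string): Promise<string> {
  const posts = await getPublishedBlogs();
  const lines = posts.map(
    (p) => `- [${p.title}](${baseUrl}/blog/${p.slug}): ${p.excerpt}`
  );
  return ["## Blog Posts", "", ...lines].join("\n");
}
```

Because the query runs at request time, a newly published post shows up in the next uncached response with no regeneration step.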

The same reasoning governs the sitemap and the robots file on this site. All three metadata surfaces run as dynamic route handlers because Next.js 16 static-prerenders anything without a dynamic API, and a statically prerendered build against an empty database would ship a file with zero blog entries. That is documented as CLAUDE.md rule 11 in the repo and it applies identically to llms.txt.

The one-hour Cache-Control header is a concession to crawler manners. If ten crawlers hit the route in the same minute, the CDN serves the cached response. One hour is long enough that the DB hit is effectively constant regardless of crawler volume, and short enough that a newly published post surfaces within an hour.

The llmstxt.org spec, line by line, for a personal brand site

The spec is thin. The required shape is:

# Project Name

> Brief description

## Pages
- [Title](url): optional description

## Optional
- Anything else

In practice the useful structure for a personal brand site is longer than the minimal spec but still lighter than a full bio page. My file uses:

  • The canonical artist name as the H1.
  • A blockquote with the canonical 170-word polymath description. This is the file's single most important paragraph because it is the part an LLM is most likely to ingest verbatim.
  • ## Key Facts with bullets for name, alternate spelling, location, roles, genres, career anchors (guitar at 15, first original at 20, Delhi rock era 2009 to 2013, solo catalog since 2013), formative influences, and a website URL.
  • ## Pages with every canonical surface on the site (music index, downloads, NCS, tags, years, about, press, blog, contact, license, DMCA, per-track template).
  • ## NCS with the CC BY 4.0 licence summary and the current free-to-use track names.
  • ## Discography with singles, albums, and tracklists.
  • ## Press & Recognition with the Grammy submission note and two third-party quotes.
  • ## Profiles with social links.
  • ## Optional with licensing and originality statements.
  • ## Blog Posts appended dynamically at request time.
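Composed in that order, the rendered file looks roughly like this (abbreviated, with placeholder values rather than the site's real content):

```
# Artist Name (Alternate Spelling)

> 170-word canonical summary of who the artist is...

## Key Facts
- Location: ...
- Roles: ...

## Pages
- [About](https://example.com/about): bio and career anchors

## Discography
- ...

## Blog Posts
- [Post Title](https://example.com/blog/post-slug): one-line excerpt
```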

The name-spelling disambiguation sits at the top of the file exactly once, inside the opening summary, with the variant spelling rendered parenthetically next to the canonical one. That placement is deliberate: the parenthetical disambiguation belongs on canonical identity surfaces (/about, /press, /dmca, /license, schema alternateName, and llms.txt) and nowhere in editorial prose.

What does NOT move citations in 2026, and what does

The honest ranking of what changes AI citation probability, based on the 2026 platform-citation-pattern research:

What does not move citations by itself:

  • llms.txt, per ALM Corp and SearchSignal.
  • Coining a new "framework" name in hopes of owning the phrase.
  • Keyword stuffing in page titles.
  • Stuffing the same FAQ block on every page (Google's scaled-content-abuse policy penalises this).
  • Switching URL paths between /blog/, /ideas/, /essays/.

What does move citations:

  • A 60-to-90-word encyclopedic opener on every page. ChatGPT extracts direct-answer chunks from the first ~60 words.
  • Named human byline. Anonymous or "Team" bylines are an E-E-A-T negative.
  • Dated first-person specifics the model could not synthesise (a year, a tool, a session count, a track name).
  • FAQPage JSON-LD on guide and reference posts where the questions are real reader queries.
  • Outbound links to independent authorities (Wikipedia, schema.org, government, academic). This is co-citation.
  • Freshness via honest dateModified on posts where facts actually changed.
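For the FAQPage lever specifically, the JSON-LD payload is small. A minimal sketch as a TypeScript object, with placeholder question-and-answer strings rather than the site's real FAQ:

```typescript
// Minimal FAQPage JSON-LD as it might be emitted from a Next.js page.
// Question/answer strings are placeholders, not the site's actual FAQ.

const faqJsonLd = {
  "@context": "https://schema.org",
  "@type": "FAQPage",
  mainEntity: [
    {
      "@type": "Question",
      name: "Does llms.txt improve citation rates?",
      acceptedAnswer: {
        "@type": "Answer",
        text: "Not by itself, per the public 2026 analyses.",
      },
    },
  ],
};

// In a page component this is typically serialised into:
// <script type="application/ld+json"
//         dangerouslySetInnerHTML={{ __html: JSON.stringify(faqJsonLd) }} />
```

The key constraint from the "what does not work" list above still applies: the questions must be real reader queries on that specific page, not a block stamped onto every URL.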

Every one of those levers is work on the visible content and the JSON-LD, not the llms.txt. The file is a nice-to-have. The content is the only thing that moves the number.

What I would change if the data shifts

If one of the five major LLM surfaces starts respecting llms.txt as a first-class ingestion source in 2027, three things change immediately.

First, the file becomes a priority-refresh target. Today I touch it when a track or blog post ships. If it becomes load-bearing, I would touch it on every material change to the bio, the press quotes, or the discography, and I would add a dateModified equivalent in a comment at the top.

Second, I would expand the ## Key Facts block to include machine-readable tags for the bio (genres, languages, years active, location, roles) in a format a crawler can parse into structured data. The current bullets read clean to a human; a consuming LLM would do better with explicit key-value pairs.

Third, I would split long sections into their own referenced files. The spec allows the H1 file to link to deeper sub-files (e.g., /llms-discography.txt), which is cleaner than one monolithic document for sites with large catalogs.

None of this is urgent in April 2026. The data says llms.txt is not load-bearing yet. Shipping a clean one is still a one-time afternoon of work with no recurring cost, and the optionality is worth keeping.

FAQ

Does adding llms.txt actually improve LLM citation rates in 2026?

No, based on every public analysis I have read. ALM Corp and SearchSignal both ran citation-probability models with and without llms.txt and found no improvement. The major LLM search surfaces (ChatGPT Search, Perplexity, Claude, Google AI Overviews, Gemini) do not treat the file as a first-class ingestion source. If your goal is more citations, the levers that actually work are the encyclopedic opener, the FAQ JSON-LD, the named author byline, and the dated first-person specifics on every page. llms.txt is optional infrastructure, not an SEO trick.

What do Perplexity, Claude, and ChatGPT Search do with llms.txt in practice?

As of 2026, they do not do anything documented with it. Their crawlers are built on top of the traditional search-index layer, plus live web fetch of the URLs that rank for the query, plus their own model-training snapshots. None of that pipeline currently routes through llms.txt. A crawler might fetch the file opportunistically, and the file might inform an experimental retrieval system, but there is no publicly documented weighting path that honours it.

Should I still ship an llms.txt if the data says it does not work?

Yes, if it takes you under an afternoon and you can run it as a dynamic route rather than a static file. The file is cheap optionality. It costs almost nothing to maintain, it doubles as a compact one-file bio surface for any crawler that does ingest it, and it is compliant the day the first major LLM surface starts respecting it. What you should not do is spend days optimising it, paying for a "llms.txt SEO audit", or believing any vendor who says it will move your citations.

How is llms.txt different from robots.txt?

robots.txt is an exclusion protocol: it tells crawlers what not to fetch. llms.txt is a discovery summary: it tells LLMs what the site is about in a compact form. Both are plaintext at the site root, both are proposed rather than mandatory, and neither is a ranking factor on its own. The difference in 2026 is that robots.txt is universally respected by every major crawler (Googlebot, Bingbot, GPTBot, ClaudeBot, PerplexityBot all honour it), while llms.txt is not yet respected by any of them as a retrieval signal.

Does llms.txt need to be static or can it be dynamic per-request?

It can be dynamic. The spec does not say anything about static vs dynamic and the file is served over plain HTTP like any other resource. On this site I run it as a Next.js 16 dynamic route handler (app/llms.txt/route.ts with export const dynamic = "force-dynamic"), which appends the current published blog posts at request time. A static file would go stale the moment a new post ships. A dynamic route is the right default unless the site's content is genuinely unchanging.

Which LLM bots actually fetch llms.txt on a live site in 2026?

In the server logs I have seen, a handful of experimental bots and research crawlers fetch it, but none of the five major AI-search surfaces has publicly documented a pipeline that ingests it. The honest answer is "nobody load-bearing, as of April 2026." If your server logs show ClaudeBot, GPTBot, PerplexityBot, or Google-Extended fetching /llms.txt, that is almost always incidental: the bot is indexing the root and the file is just another URL in the crawl frontier, not a privileged input.