llms.txt: how to help ChatGPT, Claude, and Perplexity cite your site correctly

Source: Habr AI. Collage: Hamidun News.

In 2026, visibility in the answers of ChatGPT, Perplexity, and Claude is no longer a privilege of large publications but a necessity for every website that wants to remain relevant. The problem is that AI crawlers often pick up information about you that is incomplete or distorted. They rely on general knowledge from the training dataset rather than pulling data directly from your website. llms.txt addresses exactly this problem: it is a simple text file in the root of your site that explains to language models who you are, what you do, and how to cite you correctly.

How llms.txt works

llms.txt is similar to robots.txt but works in the opposite direction. robots.txt governs crawling by conventional search bots (Googlebot, YandexBot) and tells them which pages to crawl and which to ignore. llms.txt is an instruction for the language models themselves: when they generate an answer to a user query, they check whether this file exists on the site and, if it does, follow your instructions about citations and sources.

When a user asks ChatGPT or Claude about your company, the model can consult llms.txt and get current information about who you are, what you do, and how to cite you. This matters most because the training data of large models is refreshed rarely (often once a year or less), so it goes stale whenever you pivot, change services, or launch a new product.
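
The lookup described above amounts to requesting one fixed path at the site root. Below is a minimal Python sketch of that URL derivation, assuming the file is served as /llms.txt on the same scheme and host as the page being cited. The function name `llms_txt_url` is hypothetical, and nothing here reflects how any particular AI crawler is actually implemented.

```python
from urllib.parse import urlsplit, urlunsplit


def llms_txt_url(page_url: str) -> str:
    """Return the root-level llms.txt URL for the site that serves page_url.

    Assumption: the file lives at /llms.txt on the same scheme and host,
    next to robots.txt. The page's path, query, and fragment are dropped.
    """
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))
```

Calling `llms_txt_url("https://hamidun.ru/blog/news?page=2#top")` yields `https://hamidun.ru/llms.txt`, so any article URL on the site maps to the same file.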

Most Russian websites don't create this file. As a result, models generate answers from generalized knowledge in the training dataset, often misrepresenting your position, confusing you with competitors, or not mentioning you at all. With llms.txt, you take explicit control over how you are represented in AI output.

In 2026, over 60% of online information searches start with ChatGPT or Perplexity, not Google. If a user asks the model about your line of business and you have no llms.txt, they will get either outdated information or a mix of competitor data. That is a direct risk of losing customers and of having your market position misrepresented.

Which five blocks should be inside

A minimal llms.txt should contain five sections:

  • Description: a one-line summary of your project (who you are, who your audience is, what you write about)
  • Full description: a detailed explanation of your mission, target audience, and examples of your work (3–5 paragraphs)
  • URL mapping: a list of key website sections with brief explanations (what's in the blog, contacts, offers)
  • Requirements: how exactly models should cite you (whether a link is required, attribution format, citation style)
  • CDN URLs: if your media files are hosted on separate domains (images.example.ru, video.example.ru), list them here

This is the minimum. Later you can add file versioning, content licensing information, recommendations on update frequency, or a list of main authors.
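
The five-section minimum above lends itself to an automated check. The following Python sketch reports which of the five labels are missing from a draft. It assumes each section starts a line as `Label: ...`, which matches this article's example layout rather than any formal specification. The names `REQUIRED_FIELDS` and `missing_fields` are hypothetical.

```python
# Labels of the five sections listed in the article, in the article's order.
REQUIRED_FIELDS = (
    "Description",
    "Full description",
    "URL mapping",
    "Requirements",
    "CDN URLs",
)


def missing_fields(text: str) -> list[str]:
    """Return the required labels that do not start any line of text.

    Assumption: a section begins with "Label:" at the start of a line.
    """
    present = set()
    for line in text.splitlines():
        label, sep, _ = line.partition(":")
        if sep:  # the line contains a colon, so it may open a section
            present.add(label.strip())
    return [name for name in REQUIRED_FIELDS if name not in present]
```

An empty result means all five sections are present. The check looks only at labels and says nothing about the quality of the text under each one.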

Example for a production website

Here's what it looks like in practice:

Description: Hamidun.ru — a blog about AI for engineers and founders

Full description: We examine how modern language models work, how to use them in production, and how to embed AI in a company's business processes. Our target audience is developers, technical leaders, and founders who want to understand the current state of AI and find practical applications in their projects.

URL mapping:
/blog/news: fresh news and announcements in the AI world
/blog/tools: reviews and comparison of AI tools
/blog/deep-dives: detailed analysis of model architecture and real-world cases
/contacts: feedback form

Requirements: Cite hamidun.ru as the original source, attach a hyperlink to the specific article, indicate authorship where it exists

CDN URLs: images.hamidun.ru, media.hamidun.ru

Last updated: 2026-05-21
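
A file like the one above can be generated from structured data so that the sections and the "Last updated" stamp stay consistent between edits. The sketch below is one possible approach in Python. The class `LlmsTxt`, its attribute names, and the one-entry-per-line layout of the URL mapping are assumptions made for illustration, not part of any specification.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class LlmsTxt:
    """Hypothetical container for the five sections from the article."""

    description: str
    full_description: str
    url_mapping: dict[str, str]  # site path -> short explanation
    requirements: str
    cdn_urls: list[str] = field(default_factory=list)

    def render(self, updated: date) -> str:
        """Return the file body, sections separated by blank lines."""
        lines = [
            f"Description: {self.description}",
            "",
            f"Full description: {self.full_description}",
            "",
            "URL mapping:",
        ]
        # One mapping entry per line, in insertion order.
        lines += [f"{path}: {note}" for path, note in self.url_mapping.items()]
        lines += [
            "",
            f"Requirements: {self.requirements}",
            "",
            f"CDN URLs: {', '.join(self.cdn_urls)}",
            "",
            f"Last updated: {updated.isoformat()}",
        ]
        return "\n".join(lines) + "\n"
```

Passing `date.today()` to `render` refreshes the stamp on every build, which keeps the date honest without manual edits.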

Upload the file to the root of your domain (next to robots.txt and sitemap.xml). Models usually discover the file and start using updates within 1–4 weeks: the first effect on citability in AI output appears after about a week and stabilizes by the fourth week.
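
For sites deployed from a local directory, the upload step can be scripted. This Python sketch writes the file into the directory that the web server exposes as the site root. The function name `install_llms_txt` and the `web_root` parameter are hypothetical, and the actual deployment mechanism (FTP, CI pipeline, CMS) will differ from site to site.

```python
from pathlib import Path


def install_llms_txt(web_root: Path, content: str) -> Path:
    """Write content to <web_root>/llms.txt and return the written path.

    Assumption: web_root is the directory served as "/", that is, the one
    that already holds robots.txt and sitemap.xml.
    """
    if not web_root.is_dir():
        raise NotADirectoryError(web_root)
    target = web_root / "llms.txt"
    # UTF-8 keeps non-ASCII project names and descriptions intact.
    target.write_text(content, encoding="utf-8")
    return target
```

After deployment, it is worth opening the root-level llms.txt URL of your domain in a browser to confirm the server returns the file as plain text.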

What this means

llms.txt levels the playing field between large information resources and small projects. Previously, a small website simply got lost in the mass of data used to train large models. Now you can state explicitly: "Here's my content, cite this, here's how to do it correctly." The effect arrives more slowly than organic traffic from Google, but it is more stable: models follow your instructions exactly rather than improvising from whatever information they happen to have. The main thing is not to delay. An llms.txt file takes about 30 minutes to write, and it keeps working for several years. Every day without the file is a missed opportunity to be cited correctly in AI output, a channel that is growing faster than Google.

Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.