Write once, see everywhere
Most personal sites are barely parseable by agents. Here's the discovery layer I added to mine, and why I think it's the more interesting alternative to llms.txt.
Credit where it's due: I didn't invent any of this. The AI Catalog spec is a Linux Foundation effort by the A2A and MCP protocol communities. I'm just an early adopter with a personal site and too much spare time.
Most personal sites are optimized for speed, SEO, and accessibility, among other things. They render fine for a human, but an agent trying to summarize who you are and what you've written has to parse the HTML, summarize the content, and fill in the blanks and the structure on its own.
I wanted mine to be different. Not because I expect a flood of agent traffic, but because I am optimizing for both human and non-human organic traffic. I can't remember the last time I did a web search myself for technical documentation. I sometimes visit a website directly to confirm documentation after my agent hydrates the context with it.
The boring layer first
Before any of the agent-specific stuff, the site ships the obvious things:
- Sitemap XML files, robots.txt, etc.
- Annotations like JSON-LD for entity schemas
- Proper HTML tags like titles, descriptions, etc.
None of this is new. All of it still matters. The boring stuff is non-negotiable because agents and LLM orchestrators still use web search, and they can only take in so much context, so you still need to account for proper SEO hygiene. Unless someone creates an agent search engine (an idea).
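To make the JSON-LD bullet concrete, here is a minimal sketch of an entity annotation. It assumes a schema.org Person entity; the name, URLs, and the person_jsonld helper are placeholders for illustration, not this site's actual markup.

```python
import json


def person_jsonld(name: str, url: str, same_as: list[str]) -> str:
    """Return a <script> tag carrying a schema.org Person entity as JSON-LD."""
    entity = {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "url": url,
        "sameAs": same_as,  # other profiles that identify the same person
    }
    # Escape "</" so the payload cannot close the script element early.
    # "\/" is a valid JSON escape for "/", so the data is unchanged.
    payload = json.dumps(entity, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder values; swap in your own identity.
print(person_jsonld(
    "Jane Example",
    "https://example.com",
    ["https://github.com/jane-example"],
))
```

The resulting tag goes in the page head, next to the title and description tags.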
The LLM/Agent layer
On top of that, I added two well-known endpoints:
- /.well-known/ai.txt, which is a human-readable policy: who I am, what's allowed, and what isn't allowed with the content
- /.well-known/ai-catalog.json, which is a machine-readable catalog of AI-related resources for the site. It can advertise AI policies, content feeds, licensing information, MCP servers, agents, APIs, models, and other machine-consumable resources so that AI systems can discover them from a single well-known endpoint.
The catalog is the part I find most interesting: one well-known URL, many discoverable resources, and no need to redefine the format every time I add something new.
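From the agent's side, discovery could look roughly like the sketch below. The two paths are the ones named above. Everything else is an assumption for illustration: the "resources", "type", "name", and "url" keys and the sample entries are placeholders, not the AI Catalog spec's actual schema, so check the spec before copying field names.

```python
import json
from urllib.parse import urljoin

# The two well-known paths described above.
WELL_KNOWN_PATHS = {
    "policy": "/.well-known/ai.txt",
    "catalog": "/.well-known/ai-catalog.json",
}


def well_known_urls(origin: str) -> dict[str, str]:
    """Resolve both well-known paths against a site origin."""
    return {name: urljoin(origin, path) for name, path in WELL_KNOWN_PATHS.items()}


# Hypothetical catalog body. The key names here are placeholders,
# NOT taken from the AI Catalog spec.
SAMPLE_CATALOG = """
{
  "resources": [
    {"type": "policy", "name": "AI policy", "url": "https://example.com/.well-known/ai.txt"},
    {"type": "feed", "name": "Posts", "url": "https://example.com/feed.xml"},
    {"type": "mcp-server", "name": "Site search", "url": "https://example.com/mcp"}
  ]
}
"""


def resources_by_type(catalog_text: str) -> dict[str, list[str]]:
    """Group the advertised resource URLs by their declared type."""
    grouped: dict[str, list[str]] = {}
    for entry in json.loads(catalog_text).get("resources", []):
        grouped.setdefault(entry["type"], []).append(entry["url"])
    return grouped


print(well_known_urls("https://example.com"))
print(resources_by_type(SAMPLE_CATALOG))
```

One fetch of the catalog URL is enough to learn about every other resource, which is the whole appeal.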
Why not just llms.txt
llms.txt is fine too, but it bakes everything into one flat document and assumes the only thing an agent wants is a curated reading list.
The catalog approach takes more setup up front, but it scales the way a sitemap does: one entry point, a list of resources behind it, and adding a new one later (an MCP server, a model card, an API) doesn't break the contract.
Neither is a standard yet. Both are bets. I'm betting on the catalog because it doesn't lock me in: if llms.txt ends up winning later, I can just advertise it from the catalog and move on.
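The "doesn't break the contract" part can be sketched in a few lines. This assumes consumers skip entry types they don't recognize; the "resources", "type", and "url" keys and the "llms-txt" type name are hypothetical placeholders, not the spec's schema.

```python
def feeds(catalog: dict) -> list[str]:
    """An existing consumer: it only reads entries whose type it knows."""
    return [e["url"] for e in catalog["resources"] if e["type"] == "feed"]


# Hypothetical catalog with a single feed entry.
catalog = {"resources": [{"type": "feed", "url": "https://example.com/feed.xml"}]}
before = feeds(catalog)

# Later: advertise llms.txt as one more resource behind the same entry point.
catalog["resources"].append(
    {"type": "llms-txt", "url": "https://example.com/llms.txt"}
)

# The old consumer sees exactly what it saw before the new entry existed.
assert feeds(catalog) == before
```

Adding the entry is an append, not a format change, so nothing that already reads the catalog has to be touched.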
What I'd tell someone copying this
Ship the boring layer first. The agent-specific stuff is a bonus on top of it, not a replacement for it.
And don't hand-write the catalog; it will drift. Generate it from the same source of truth that feeds your sitemap and your feed.
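Here is a minimal sketch of that single source of truth, assuming the site's pages live in one list. The Page fields, the sample pages, and the catalog's key names are all hypothetical; only the sitemap namespace and element names come from the sitemaps.org protocol.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


@dataclass(frozen=True)
class Page:
    url: str
    title: str
    updated: str  # ISO 8601 date, e.g. "2025-01-02"


# The single source of truth (placeholder data).
PAGES = [
    Page("https://example.com/", "Home", "2025-01-01"),
    Page("https://example.com/posts/hello", "Hello", "2025-01-02"),
]


def build_sitemap(pages: list[Page]) -> str:
    """Render the pages as a sitemaps.org <urlset> document."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        node = ET.SubElement(urlset, "url")
        ET.SubElement(node, "loc").text = page.url
        ET.SubElement(node, "lastmod").text = page.updated
    return ET.tostring(urlset, encoding="unicode")


def build_catalog(pages: list[Page]) -> str:
    """Render the same pages as catalog entries (hypothetical key names)."""
    resources = [
        {"type": "page", "name": p.title, "url": p.url, "updated": p.updated}
        for p in pages
    ]
    return json.dumps({"resources": resources}, indent=2)


print(build_sitemap(PAGES))
print(build_catalog(PAGES))
```

Because both outputs are derived from PAGES in the same build step, a new post shows up in the sitemap and the catalog together, and neither can drift from the other.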
The takeaway
We'll see which approach wins. Shipping both is cheap. Not shipping anything means an agent gets to decide who you are, and it usually decides wrong.