Developer · 14 min read · @codewitholgun

How to Make Your Website AI-Agent Ready in 2026 (Full Checklist)

Tags: AI Agents, GEO, Developer Tools, AI SEO, MCP, WebMCP


Publish seven machine-readable entry points and your site becomes discoverable and usable by AI agents: an llms.txt, RFC 8288 Link headers, an RFC 9727 API catalog at /.well-known/api-catalog, an MCP Server Card at /.well-known/mcp/server-card.json, an Agent Skills index at /.well-known/agent-skills/index.json, Content Signals in robots.txt, and Markdown-for-Agents content negotiation. This guide walks through each one with copy-paste examples you can adapt today. FindUtils (findutils.com) implements every single one, and our AI Agent Starter Guide shows the full setup interactively.

If you only have 10 minutes, skip to the Minimum Viable Agent-Ready Site section near the end — it lists the three highest-leverage files to ship first.

Why Agent-Readiness Matters in 2026

AI agents make up a fast-growing slice of web traffic, and they don't read websites the way humans do.

  • AI search engines (ChatGPT, Claude, Perplexity, Gemini, Bing Copilot) cite sites with structured, machine-readable content far more often.
  • MCP clients (Claude Desktop, Cursor, Zed, Windsurf) look for an MCP Server Card before they'll connect to your tools.
  • Autonomous agents built on LangChain, CrewAI, the Agents SDK, or raw tool-calling loops discover APIs by fetching /.well-known/api-catalog.
  • Browser agents running WebMCP-enabled Chromium invoke page-level tools that your site registers via navigator.modelContext.provideContext(), without a human click.
  • Training and citation pipelines respect Content Signals in robots.txt.

A site that publishes none of these is invisible to the agent layer. A site that publishes all of them becomes a first-class programmable surface.

The Seven-File Agent-Ready Stack

File | Purpose | Spec
/llms.txt | LLM-friendly site overview | llmstxt.org
/robots.txt + Content-Signal: | Train/search/input preferences | contentsignals.org
/.well-known/api-catalog | API discovery linkset | RFC 9727
/.well-known/mcp/server-card.json | MCP server descriptor | SEP-1649
/.well-known/agent-skills/index.json | Skills discovery index | agentskills.io RFC v0.2.0
Link: response headers | Inline discovery for all the above | RFC 8288
Accept: text/markdown negotiation | Machine-readable page variant | Cloudflare Markdown-for-Agents

Optional but high-leverage:

  • navigator.modelContext.provideContext() for browser-side tool exposure (WebMCP)
  • OpenAPI 3.1 spec at /api/openapi.json
  • /.well-known/oauth-protected-resource if your APIs are authenticated (RFC 9728)

Step 1: Publish /llms.txt

Open your Robots.txt Generator and your favorite text editor. Create public/llms.txt (or the equivalent for your platform) with a short markdown document that lists your site's purpose and key pages.

MD
# Example Corp

> Example Corp (https://example.com) builds X for Y. All processing is client-side.

## Pages

- [Pricing](https://example.com/pricing): Plans and limits
- [Docs](https://example.com/docs): Developer documentation
- [API](https://example.com/api): REST endpoints

## AI-friendly
- llms.txt: https://example.com/llms.txt
- llms-full.txt: https://example.com/llms-full.txt (optional, expanded)

AI crawlers fetch this before diving into your HTML. It's your elevator pitch to the model.

Step 2: Add Content Signals to robots.txt

Open your site's robots.txt and add a Content-Signal: directive as the first content line. This declares your AI content usage preferences per the contentsignals.org / IETF draft-romm-aipref-contentsignals specification.

# Content Signals — declare AI content usage preferences
Content-Signal: search=yes, ai-train=yes, ai-input=yes

User-agent: *
Allow: /

Three signals, each yes or no:

  • search — allow indexing in search engines
  • ai-train — allow use in AI training datasets
  • ai-input — allow use as context/input in AI answers (citation)

If you want AI to cite you but not train on your content, use search=yes, ai-train=no, ai-input=yes. If you run a free public resource like FindUtils, opt in across the board.

Step 3: Publish /.well-known/api-catalog (RFC 9727)

Create a JSON file at /.well-known/api-catalog with MIME type application/linkset+json. Each entry anchors an API and links to its service-desc (OpenAPI), service-doc (human docs), and optionally status (health endpoint).

JSON
{
  "linkset": [
    {
      "anchor": "https://api.example.com/",
      "service-desc": [
        { "href": "https://example.com/api/openapi.json",
          "type": "application/vnd.oai.openapi+json;version=3.1" }
      ],
      "service-doc": [
        { "href": "https://example.com/api", "type": "text/html" }
      ],
      "status": [
        { "href": "https://api.example.com/health", "type": "application/json" }
      ]
    }
  ]
}

Agents discover your API by fetching this one file. They no longer need to crawl your docs to find the spec.
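To see what this buys an agent, here's a minimal consumer sketch. It assumes the linkset shape shown above; the example.com URL is a placeholder.

JS
// Minimal sketch: discover an API's spec and docs from the catalog.
const res = await fetch('https://example.com/.well-known/api-catalog', {
  headers: { Accept: 'application/linkset+json' },
});
const { linkset } = await res.json();

for (const api of linkset) {
  const spec = api['service-desc']?.[0]?.href; // OpenAPI document
  const docs = api['service-doc']?.[0]?.href;  // human-readable docs
  console.log(`API at ${api.anchor}: spec=${spec}, docs=${docs}`);
}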

Step 4: Publish an MCP Server Card

If you run an MCP server, publish a Server Card at /.well-known/mcp/server-card.json. The format is specified in SEP-1649.

JSON
{
  "protocolVersion": "2025-03-26",
  "serverInfo": {
    "name": "example",
    "title": "Example MCP Server",
    "version": "1.0.0",
    "description": "12 utilities for X and Y.",
    "homepage": "https://example.com",
    "license": "MIT"
  },
  "transports": [
    { "type": "streamable-http", "url": "https://mcp.example.com/" }
  ],
  "capabilities": { "tools": { "listChanged": false } },
  "authentication": { "required": false },
  "rateLimit": { "perMinute": 120, "perDay": 1000 }
}

MCP clients that support discovery (Claude Desktop, Cursor, MCP Inspector) can import your server by URL alone — they fetch the card and know exactly how to connect.
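What discovery looks like from the client side, roughly: fetch the card, pick a transport, connect. A hedged sketch, using the field names from the SEP-1649 example above:

JS
// Minimal sketch: read a Server Card and pick an endpoint to connect to.
const card = await (
  await fetch('https://example.com/.well-known/mcp/server-card.json')
).json();

const http = card.transports.find((t) => t.type === 'streamable-http');
if (http && !card.authentication?.required) {
  console.log(`Connect to ${card.serverInfo.title} at ${http.url}`);
}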

Step 5: Publish an Agent Skills Index

The Agent Skills Discovery RFC (agentskills.io, v0.2.0) defines a format for exposing machine-readable skills — step-by-step playbooks an agent can follow.

Create /.well-known/agent-skills/index.json:

JSON
{
  "$schema": "https://raw.githubusercontent.com/cloudflare/agent-skills-discovery-rfc/main/schemas/index.schema.json",
  "version": "0.2.0",
  "name": "Example Agent Skills",
  "skills": [
    {
      "name": "search-catalog",
      "type": "markdown",
      "description": "How to search the Example catalog via the REST API.",
      "url": "https://example.com/.well-known/agent-skills/search-catalog/SKILL.md",
      "sha256": "7b96a62daec09466fb3faa3fccd09770664412326803bd3489aba52e611435e0"
    }
  ]
}

Then create each SKILL.md with frontmatter and a step-by-step markdown body. Compute the sha256 digest so agents can verify the file hasn't been tampered with:

shasum -a 256 public/.well-known/agent-skills/search-catalog/SKILL.md
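If you'd rather script the digests than run shasum by hand, here's a minimal Node sketch; the skill path is the hypothetical one from the index above.

JS
// Minimal sketch: recompute sha256 digests so index.json never goes stale.
import { createHash } from 'node:crypto';
import { readFileSync } from 'node:fs';

// Hypothetical skill paths; list yours here.
const skills = ['public/.well-known/agent-skills/search-catalog/SKILL.md'];

for (const path of skills) {
  const digest = createHash('sha256').update(readFileSync(path)).digest('hex');
  console.log(`${path}: ${digest}`); // paste into the skill's "sha256" field
}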

Step 6: Add RFC 8288 Link Headers

Every HTML page on your site should return Link: headers pointing at the resources above. On Cloudflare Pages this lives in _headers:

/*
  Link: </.well-known/api-catalog>; rel="api-catalog"; type="application/linkset+json"
  Link: </api/openapi.json>; rel="service-desc"; type="application/vnd.oai.openapi+json;version=3.1"
  Link: </api>; rel="service-doc"; type="text/html"
  Link: </llms.txt>; rel="describedby"; type="text/plain"
  Link: </.well-known/agent-skills/index.json>; rel="https://agentskills.io/rels/skills-index"; type="application/json"
  Link: </.well-known/mcp/server-card.json>; rel="https://modelcontextprotocol.io/rels/server-card"; type="application/json"

On Nginx:

add_header Link '</.well-known/api-catalog>; rel="api-catalog"; type="application/linkset+json"';
add_header Link '</api/openapi.json>; rel="service-desc"; type="application/vnd.oai.openapi+json;version=3.1"' always;

On Express:

JS
app.use((req, res, next) => {
  res.setHeader('Link', [
    '</.well-known/api-catalog>; rel="api-catalog"; type="application/linkset+json"',
    '</api/openapi.json>; rel="service-desc"; type="application/vnd.oai.openapi+json;version=3.1"',
    '</llms.txt>; rel="describedby"; type="text/plain"',
  ].join(', '));
  next();
});

Verify with:

curl -sI https://your-site.com/ | grep -i ^link
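On the consuming side, an agent can walk these links without parsing any HTML. A minimal sketch with a deliberately naive parser (real clients should use a full RFC 8288 parser; the URL is a placeholder):

JS
// Minimal sketch: list the discovery links a page advertises.
const res = await fetch('https://example.com/', { method: 'HEAD' });
const link = res.headers.get('link') || '';

for (const part of link.split(/,\s*(?=<)/)) {
  const m = part.match(/<([^>]+)>;\s*rel="([^"]+)"/);
  if (m) console.log(`${m[2]} -> ${m[1]}`);
}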

Step 7: Add Markdown-for-Agents Content Negotiation

When a client sends Accept: text/markdown, serve a markdown version of the page instead of HTML. On Cloudflare Pages the easiest route is an Advanced-Mode _worker.js:

JS
export default {
  async fetch(request, env) {
    const accept = request.headers.get('accept') || '';
    // Only reroute clients that ask for markdown and do NOT accept HTML,
    // so normal browsers are never affected.
    const wantsMarkdown = /text\/markdown/i.test(accept) && !/text\/html/i.test(accept);

    const url = new URL(request.url);
    // Only page-like paths (directories or extensionless) have markdown siblings.
    if (wantsMarkdown && (url.pathname.endsWith('/') || !url.pathname.includes('.'))) {
      const mdUrl = new URL(
        (url.pathname.endsWith('/') ? url.pathname : url.pathname + '/') + 'markdown.md',
        url.origin
      );
      const mdResp = await env.ASSETS.fetch(new Request(mdUrl, request));
      if (mdResp.ok) {
        const headers = new Headers(mdResp.headers);
        headers.set('Content-Type', 'text/markdown; charset=utf-8');
        headers.set('Vary', 'Accept'); // keep caches from mixing HTML and markdown
        return new Response(mdResp.body, { status: 200, headers });
      }
    }
    // Everyone else falls through to the static HTML asset.
    return env.ASSETS.fetch(request);
  },
};

Generate the sibling markdown.md files at build time. Static site generators can emit them via templates; for Astro, add a dynamic route that renders the same content as markdown.
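One way to generate them, sketched with the turndown library; the dist/ path and the <main> extraction are assumptions about your build output, not a fixed recipe.

JS
// Minimal build-step sketch: emit a markdown.md sibling per built HTML page.
import TurndownService from 'turndown';
import { readFileSync, writeFileSync } from 'node:fs';

const td = new TurndownService({ headingStyle: 'atx' });

for (const page of ['dist/some-page/index.html']) { // hypothetical output path
  const html = readFileSync(page, 'utf8');
  // Keep only the article body so nav, ads, and JS chrome never reach agents.
  const body = html.match(/<main[^>]*>([\s\S]*?)<\/main>/)?.[1] ?? html;
  writeFileSync(page.replace('index.html', 'markdown.md'), td.turndown(body));
}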

Test it:

curl -H "Accept: text/markdown" https://your-site.com/some-page/

You should get a Content-Type: text/markdown response with the page's content as clean markdown.

Step 8: Optional — Add WebMCP Browser Tools

If your site has actions that make sense for an agent (search, navigate, execute), expose them via WebMCP. Add this to your site-wide layout:

HTML
<script>
if (navigator.modelContext?.provideContext) {
  navigator.modelContext.provideContext({
    tools: [
      {
        name: 'searchCatalog',
        description: 'Search the Example catalog by keyword.',
        inputSchema: {
          type: 'object',
          properties: { query: { type: 'string' } },
          required: ['query']
        },
        execute: (args) => {
          const url = '/search?q=' + encodeURIComponent(args.query);
          window.location.href = url;
          return { url };
        }
      }
    ]
  });
}
</script>

WebMCP-enabled browsers (Chrome Origin Trial) will expose searchCatalog to any running agent on your page. Non-WebMCP browsers ignore the call (the feature-check guards it).

Real-World Scenarios

Scenario 1: You run a SaaS with a REST API

  1. Ship /llms.txt describing your product.
  2. Publish /api/openapi.json (most API frameworks emit this automatically).
  3. Add /.well-known/api-catalog pointing at the OpenAPI spec.
  4. Add Link: service-desc headers.

Result: agents building integrations discover your API in one fetch instead of scraping your docs.

Scenario 2: You run a content site (blog, docs, news)

  1. Ship /llms.txt with your top pages.
  2. Add Content Signals (typically search=yes, ai-train=yes, ai-input=yes for public content).
  3. Add Markdown-for-Agents negotiation.
  4. Add Link: describedby pointing at llms.txt.

Result: ChatGPT, Claude, and Perplexity cite you more often because your content is cheaper and cleaner to extract.

Scenario 3: You run a developer tool (like FindUtils)

  1. Ship all seven files above.
  2. Add an MCP server wrapping your core functions — JSON-RPC 2.0 over HTTP is the simplest transport.
  3. Publish the MCP Server Card.
  4. Ship SKILL.md files for common agent workflows.
  5. Add WebMCP tools for in-browser invocation.

Result: agents can use your tools as first-class callable surfaces, not just reference material. This is what FindUtils (findutils.com) did — the AI Agent Starter Guide walks through the setup with live examples.

Agent-Ready vs Not-Agent-Ready: Tool Comparison

Feature | Agent-ready site | Non-agent-ready site
Time to integrate into an agent | Minutes (one curl) | Days (custom scrapers, DOM parsing)
AI citation frequency | High (structured extraction) | Low (HTML noise + ad clutter)
Works with Claude Desktop / Cursor out of the box | Yes (MCP Server Card) | No (requires custom server)
Works with browser agents | Yes (WebMCP) | No
Cost per token to LLMs citing your content | Low (markdown variant) | High (HTML/JS noise)
Future-proof as new agent protocols land | Incremental adds | Full rebuild each time

Competitor Comparison: How Agent-Readiness Tools Stack Up

Approach | Effort | Coverage | Free?
FindUtils open-standard stack (this guide) | ~1 day | All agent protocols | Yes
Custom scrapers per site | Weeks per site | Agent-specific | Yes (labor cost)
Third-party "agent gateway" SaaS | Low setup | Proxy, not native | Usually paid
Do nothing | None | Invisible to agents | Free (but costly to visibility)

The open-standard stack is the highest-leverage option because every new MCP client, every new autonomous agent, and every new AI search engine builds on the same specs. You ship the files once; everyone benefits forever.

Common Mistakes and Fixes

Mistake 1: Serving /.well-known/api-catalog as application/octet-stream

Files without an extension frequently get the wrong MIME type. Pin application/linkset+json; charset=utf-8 explicitly in your host's headers config. Verify:

curl -sI https://your-site.com/.well-known/api-catalog | grep -i content-type
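On Cloudflare Pages, for example, pin it in the same _headers file used in Step 6:

/.well-known/api-catalog
  Content-Type: application/linkset+json; charset=utf-8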

Mistake 2: Forgetting to compute sha256 for SKILL.md files

The Agent Skills index requires a sha256 digest for each skill. Recompute on every change or agents will reject stale entries:

shasum -a 256 public/.well-known/agent-skills/*/SKILL.md

Mistake 3: Setting ai-train=no on a public resource

Unless you're monetizing your content directly, ai-train=no keeps your content out of training datasets, which means future models never learn it exists. Most public sites benefit from ai-train=yes.

Mistake 4: WebMCP tools that require user confirmation

WebMCP tools should be idempotent and side-effect-free by default. Navigation and search are fine. Don't expose deleteAccount or purchase via WebMCP — those need a human in the loop.

Mistake 5: Returning the same HTML for Accept: text/markdown

Content negotiation is worthless if the markdown variant is just HTML with a different header. Generate real markdown — strip nav, ads, and JS chrome.

Minimum Viable Agent-Ready Site

If you only have an afternoon, ship these three files:

  1. /llms.txt — a short markdown document describing your site and key pages.
  2. Content-Signal: directive — one line at the top of robots.txt.
  3. Link: </llms.txt>; rel="describedby"; type="text/plain" header — one header in your host's config.

That's it. You've just become more agent-discoverable than 99% of the web. Everything else in this guide stacks on top.

Tools Used in This Guide

  • AI Agent Starter Guide — Interactive playground for Claude Code, Copilot, Cursor, Gemini, Codex, Windsurf
  • Robots.txt Generator — Build a robots.txt with Content Signals in seconds
  • AI Model Picker — Compare Claude, GPT, Gemini, and local models by context window and price
  • LLM Requirements Calculator — Estimate RAM, VRAM, and hardware needs for local models
  • Claude Code Usage Analyzer — Parse your Claude Code session logs and visualize usage
  • FindUtils Tool API — A working example of RFC 9727, OpenAPI 3.1, and api-catalog in production
  • FindUtils MCP Server — A working example of an MCP Server Card at /.well-known/mcp/server-card.json

Next Steps

  • Read the companion post: One of the Most Agent-Ready Websites on the Internet — the real-world rollout FindUtils did in one day.
  • Scan your own site with isitagentready.com to see which files you're missing.
  • Build an MCP server wrapping your core API — JSON-RPC 2.0 over HTTP is the simplest transport.
  • Submit your updated sitemap and llms.txt to IndexNow so search + AI crawlers pick up the changes within hours.

FAQ

Q1: Is making a website agent-ready free? A: Yes. Every protocol in this guide is an open standard with free reference implementations. The files are small — the total payload across all seven agent-readiness files on FindUtils (findutils.com) is under 15 KB. You pay only in developer time.

Q2: Do I need to run an MCP server to be agent-ready? A: No. An MCP server is required only if you want agents to execute tools on your site. Pure content sites (blogs, docs, news) become agent-ready with just llms.txt, Link headers, Content Signals, and Markdown-for-Agents negotiation.

Q3: What's the single highest-leverage agent-readiness file? A: llms.txt at the site root. It takes 30 minutes to author, requires no infrastructure changes, and is the first thing AI search engines look for. Every other file on the checklist is a multiplier; llms.txt is the base.

Q4: How do I test my agent-ready setup? A: Three checks. (1) curl -sI https://yoursite.com/ | grep -i ^link should show multiple Link headers. (2) curl -s https://yoursite.com/.well-known/api-catalog | jq should return valid linkset+json. (3) curl -H "Accept: text/markdown" https://yoursite.com/some-page/ should return markdown, not HTML. Also run isitagentready.com for an external audit.

Q5: Will adding these files slow down my site? A: No. The static files are under 15 KB total and cached by the CDN. The Link response headers add ~400 bytes per HTML response. Markdown-for-Agents negotiation runs at the edge with no round-trip to origin. Measurable impact: zero.

Q6: Is WebMCP production-ready in 2026? A: WebMCP is a proposal running as an origin trial in Chromium browsers as of 2026. Ship the navigator.modelContext.provideContext() call behind a feature check — it's a no-op in browsers that don't implement the API and a functional integration in those that do. Zero-risk progressive enhancement.

Q7: Do I need to authenticate my MCP server? A: Only if your tools mutate data or cost money to run. Public, pure-computation tools (like FindUtils' 54 utilities) are safe to expose unauthenticated with rate limits. If you need auth, publish /.well-known/oauth-protected-resource per RFC 9728 and follow the OAuth 2.0 discovery flow.

Q8: How does agent-readiness interact with GEO and traditional SEO? A: It complements both. GEO (Generative Engine Optimization) focuses on content structure — answer capsules, comparison tables, FAQs. Agent-readiness focuses on discovery infrastructure — machine-readable entry points. Traditional SEO focuses on search rankings. The three stack: a GEO-optimized site with an agent-ready discovery layer and solid SEO fundamentals wins on all three channels.

Q9: What happens if agent-readiness standards evolve? A: The foundational pieces (Link headers, llms.txt, OpenAPI) are stable and backwards-compatible. Newer specs (WebMCP, Content Signals, Agent Skills) are versioned — declare your version in the JSON and update when the spec changes. FindUtils commits to updating the stack as specs mature, and this guide gets refreshed with each major change.