MCP (Model Context Protocol) Support

ServeMD includes built-in support for the Model Context Protocol (MCP), enabling LLMs like Claude to interactively query your documentation instead of loading everything into context.

Why MCP?

Traditional approaches like llms.txt and llms-full.txt dump entire documentation into context, which:

Wastes tokens - A 500KB documentation site uses ~125K tokens
Hits context limits - Large documentation may exceed context windows
Lacks precision - LLMs must search through all content for relevant info

MCP provides on-demand search and retrieval:

250x less context - Typical queries use ~2KB instead of 500KB
Precise results - Full-text search with relevance scoring
Scales with corpus size - Per-query context stays small whether the site has 10 or 10,000 documentation pages
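The figures above follow from simple arithmetic. The sketch below reproduces them, assuming roughly 4 bytes of text per token (a common rule of thumb, not a ServeMD constant) and treating 1KB as 1,000 bytes.

```python
# Assumption: ~4 bytes of text per token; real tokenizers vary by model and content.
BYTES_PER_TOKEN = 4

def estimate_tokens(size_bytes: int) -> int:
    """Rough token estimate for a payload of the given size."""
    return size_bytes // BYTES_PER_TOKEN

full_dump = estimate_tokens(500_000)  # a 500KB documentation dump
mcp_query = estimate_tokens(2_000)    # a typical ~2KB MCP response

print(full_dump)               # 125000
print(mcp_query)               # 500
print(full_dump // mcp_query)  # 250
```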

Quick Start

1. Enable MCP (Default: Enabled)

# MCP is enabled by default
# To disable:
MCP_ENABLED=false

2. Test the Endpoint

# Initialize handshake
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": "1",
    "method": "initialize",
    "params": {
      "protocolVersion": "2024-11-05",
      "capabilities": {},
      "clientInfo": {"name": "curl", "version": "1.0"}
    }
  }'

# List available tools
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": "2",
    "method": "tools/list"
  }'
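Both requests share the same JSON-RPC 2.0 envelope. As a minimal sketch, the same payloads can be built in Python; the helper name `jsonrpc_request` is hypothetical and only illustrates the shape of the messages.

```python
import json
from typing import Any, Optional

def jsonrpc_request(method: str, params: Optional[dict] = None, request_id: str = "1") -> dict:
    """Build a JSON-RPC 2.0 request; `params` is left out when not provided."""
    request: dict[str, Any] = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        request["params"] = params
    return request

# Same payloads as the two curl calls above
initialize = jsonrpc_request(
    "initialize",
    {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "curl", "version": "1.0"},
    },
)
list_tools = jsonrpc_request("tools/list", request_id="2")

print(json.dumps(list_tools))
```

The resulting dictionaries can be sent as the POST body with any HTTP client.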

In-Page Search (Human Users)

When MCP is enabled, ServeMD adds an in-page search experience for human readers:

Search bar in the topbar (configurable via {{search}} in topbar.md — see Navigation)
Search page at /search?q=... with live results as you type (debounced, min 3 characters)
Keyboard shortcut — press / to focus the search bar from anywhere (except when typing in an input)
Escape — blurs the search input

Search uses the same Whoosh index as the MCP search_docs tool. Results show titles, snippets, and links to documentation pages. Search terms are highlighted in pale yellow.

Example: Visit http://localhost:8080/search?q=configuration to search for "configuration".

Available Tools

ServeMD exposes three MCP tools:

search_docs

Search documentation with full-text search powered by Whoosh.

Features:

- Fuzzy search (typo tolerance): configration~ finds "configuration"
- Boolean operators: auth AND login, rate OR limit, config NOT debug
- Field-specific: title:API, content:authentication
- Phrase search: "rate limiting"

Searching structured identifiers:

Structured IDs found in headings (UC-2-002, AUTH-01, G-02, KEV-123, #002 …) are indexed as exact tokens with a high relevance boost. Type the identifier as-is — no quotes needed:

UC-2-002          → finds the heading that defines it (top result)
AUTH-01           → finds the auth screen / route entry
KECMAP2-1234      → works even when the prefix contains digits

Rules of thumb:

- Exact beats fuzzy for IDs: skip the ~ suffix
- Mixed letter+digit tokens (e.g. v2, OAuth2, Screen01) are also boosted when they appear in headings
- Filename fragments work too: AUTH_01 matches a file like screens/AUTH_01_login.md
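To make the "mixed letter+digit token" rule concrete, here is a rough approximation in Python. The regular expression and the function name `structured_ids` are assumptions for illustration; ServeMD's actual extraction logic may differ.

```python
import re

# Assumption: an identifier is a run of letters/digits joined by "-" or "_",
# optionally prefixed with "#". This only approximates the rule described above.
TOKEN_RE = re.compile(r"#?[A-Za-z0-9]+(?:[-_][A-Za-z0-9]+)*")

def structured_ids(heading: str) -> list[str]:
    """Return tokens in a heading that look like structured identifiers."""
    ids = []
    for token in TOKEN_RE.findall(heading):
        has_digit = any(ch.isdigit() for ch in token)
        has_alpha = any(ch.isalpha() for ch in token)
        # Keep mixed letter+digit tokens, plus "#"-prefixed numbers such as #002
        if has_digit and (has_alpha or token.startswith("#")):
            ids.append(token)
    return ids

for heading in ["UC-2-002: Reset password", "AUTH-01 Login screen (v2)", "Rate Limiting"]:
    print(heading, "->", structured_ids(heading))
```

Plain words such as "Reset" or "Limiting" contain no digits and are skipped, which is why typing an identifier exactly is enough to single out its heading.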

Request:

{
  "jsonrpc": "2.0",
  "id": "1",
  "method": "tools/call",
  "params": {
    "name": "search_docs",
    "arguments": {
      "query": "rate limiting",
      "limit": 5
    }
  }
}

Response:

{
  "jsonrpc": "2.0",
  "id": "1",
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Found 3 result(s):\n\n1. **API Endpoints** (`api/endpoints.md`)\n   Category: api\n   Score: 15.50\n   Rate limiting is enforced at 120 requests...\n\n..."
      }
    ]
  }
}
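The result arrives as a single text block rather than structured JSON. If a client needs individual fields, it can parse that text. The sketch below assumes the layout shown in the example response (rank, bold title, path in backticks, then Category and Score lines); the layout is inferred from that one example and may change, and `parse_results` is a hypothetical helper.

```python
import re

# Assumption: each entry follows the layout of the example response above.
ENTRY_RE = re.compile(
    r"^(?P<rank>\d+)\. \*\*(?P<title>.+?)\*\* \(`(?P<path>[^`]+)`\)\n"
    r"\s+Category: (?P<category>.*)\n"
    r"\s+Score: (?P<score>[\d.]+)",
    re.MULTILINE,
)

def parse_results(text: str) -> list[dict]:
    """Extract title, path, category, and score from a search_docs text result."""
    return [
        {
            "title": m["title"],
            "path": m["path"],
            "category": m["category"],
            "score": float(m["score"]),
        }
        for m in ENTRY_RE.finditer(text)
    ]

sample = (
    "Found 1 result(s):\n\n"
    "1. **API Endpoints** (`api/endpoints.md`)\n"
    "   Category: api\n"
    "   Score: 15.50\n"
    "   Rate limiting is enforced at 120 requests...\n"
)
print(parse_results(sample))
```

The extracted `path` can be passed directly to get_doc_page.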

get_doc_page

Retrieve the full content of a specific documentation page, optionally filtered to specific sections.

Request (full page):

{
  "jsonrpc": "2.0",
  "id": "1",
  "method": "tools/call",
  "params": {
    "name": "get_doc_page",
    "arguments": {
      "path": "api/endpoints.md"
    }
  }
}

Request (specific sections):

{
  "jsonrpc": "2.0",
  "id": "1",
  "method": "tools/call",
  "params": {
    "name": "get_doc_page",
    "arguments": {
      "path": "api/endpoints.md",
      "sections": ["GET /health", "Rate Limiting"]
    }
  }
}
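To illustrate what section filtering means, the sketch below keeps only the requested headings and everything nested under them. It assumes sections are matched by heading text, case-insensitively, and that a section ends at the next heading of the same or higher level; these rules and the function name `filter_sections` are assumptions, not a description of ServeMD's implementation.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*$")

def filter_sections(markdown: str, wanted: list[str]) -> str:
    """Return only the sections whose heading text is in `wanted`."""
    wanted_lower = {w.lower() for w in wanted}
    kept = []
    keep_level = None  # heading level of the section currently being kept
    for line in markdown.splitlines():
        match = HEADING_RE.match(line)
        if match:
            level, title = len(match.group(1)), match.group(2)
            if title.lower() in wanted_lower:
                keep_level = level
            elif keep_level is not None and level <= keep_level:
                # A sibling or parent heading closes the kept section
                keep_level = None
        if keep_level is not None:
            kept.append(line)
    return "\n".join(kept)

doc = (
    "# API\n\n"
    "## GET /health\nReturns ok.\n\n"
    "## Auth\nTokens.\n\n"
    "## Rate Limiting\n120 per minute.\n### Headers\nRetry-After\n"
)
print(filter_sections(doc, ["GET /health", "Rate Limiting"]))
```

This simple version does not special-case `#` lines inside code blocks, which a production implementation would need to handle.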

list_doc_pages

List all available documentation pages, optionally filtered by category.

Request (all pages):

{
  "jsonrpc": "2.0",
  "id": "1",
  "method": "tools/call",
  "params": {
    "name": "list_doc_pages",
    "arguments": {}
  }
}

Request (filtered by category):

{
  "jsonrpc": "2.0",
  "id": "1",
  "method": "tools/call",
  "params": {
    "name": "list_doc_pages",
    "arguments": {
      "category": "api"
    }
  }
}
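Categories come from the directory structure. As a rough illustration of the filter, the sketch below assumes the category is the first directory segment of a page path and that top-level pages have no category; both rules are assumptions for illustration, as are the helper names.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional

def category_of(path: str) -> str:
    """Assumed rule: first directory segment, or "" for top-level pages."""
    parts = PurePosixPath(path).parts
    return parts[0] if len(parts) > 1 else ""

def filter_by_category(paths: Iterable[str], category: Optional[str] = None) -> list[str]:
    """Return all paths, or only those in the given category."""
    if category is None:
        return list(paths)
    return [p for p in paths if category_of(p) == category]

pages = ["index.md", "api/endpoints.md", "api/auth.md", "guides/setup.md"]
print(filter_by_category(pages, "api"))
```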

Configuration

| Environment Variable | Default | Description |
|---|---|---|
| `MCP_ENABLED` | `true` | Enable/disable MCP endpoint |
| `MCP_RATE_LIMIT_REQUESTS` | `120` | Max requests per window per IP |
| `MCP_RATE_LIMIT_WINDOW` | `60` | Rate limit window in seconds |
| `MCP_MAX_SEARCH_RESULTS` | `10` | Default max search results |
| `MCP_SNIPPET_LENGTH` | `200` | Max characters for snippets |
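For scripts that need to mirror these settings (for example, a client that paces itself to the server's rate limit), the variables can be read with the same defaults. This is a minimal sketch: the `McpSettings` class and the accepted boolean spellings are assumptions, not ServeMD's own configuration code.

```python
import os
from dataclasses import dataclass
from typing import Mapping

@dataclass(frozen=True)
class McpSettings:
    enabled: bool
    rate_limit_requests: int
    rate_limit_window: int  # seconds
    max_search_results: int
    snippet_length: int     # characters

def load_settings(env: Mapping[str, str] = os.environ) -> McpSettings:
    """Read the MCP variables, falling back to the documented defaults."""
    # Assumption: "1", "true", and "yes" (any case) count as enabled
    enabled = env.get("MCP_ENABLED", "true").strip().lower() in ("1", "true", "yes")
    return McpSettings(
        enabled=enabled,
        rate_limit_requests=int(env.get("MCP_RATE_LIMIT_REQUESTS", "120")),
        rate_limit_window=int(env.get("MCP_RATE_LIMIT_WINDOW", "60")),
        max_search_results=int(env.get("MCP_MAX_SEARCH_RESULTS", "10")),
        snippet_length=int(env.get("MCP_SNIPPET_LENGTH", "200")),
    )

print(load_settings({"MCP_RATE_LIMIT_REQUESTS": "300"}))
```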

Rate Limiting

The MCP endpoint is rate-limited to prevent abuse:

Default: 120 requests per 60 seconds per IP address
Response on limit: JSON-RPC error with retry information

{
  "jsonrpc": "2.0",
  "id": "1",
  "error": {
    "code": -32603,
    "message": "Rate limit exceeded",
    "data": {
      "retryAfter": 60,
      "limit": "120/60s"
    }
  }
}
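A per-IP limit of this kind can be pictured as a sliding window over recent request timestamps. The sketch below shows one common way to implement the idea; it is not ServeMD's actual limiter, and the class name and the way the retry delay is computed are assumptions.

```python
import math
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each client IP."""

    def __init__(self, max_requests=120, window_seconds=60, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock
        self._hits = defaultdict(deque)  # client IP -> timestamps of recent requests

    def check(self, client_ip: str):
        """Return (allowed, retry_after_seconds)."""
        now = self.clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have left the window
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            # The oldest request leaving the window frees the next slot
            return False, math.ceil(self.window - (now - hits[0]))
        hits.append(now)
        return True, 0

limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60)
print(limiter.check("203.0.113.7"))
```

A rejected request would then be answered with the JSON-RPC error shown above, with the second tuple element as `retryAfter`.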

Search Index

ServeMD builds a Whoosh search index on startup:

First start: ~500ms to index (100 docs)
Subsequent starts: ~10ms to load from cache
Cache location: CACHE_ROOT/mcp/whoosh/
Automatic rebuild: When docs change (hash-based validation)
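Hash-based validation can be sketched as a fingerprint over every Markdown file's path and content: if the fingerprint stored with the cache no longer matches, the index is rebuilt. The functions below are a minimal illustration under that assumption; the choice of SHA-256, the `*.md` glob, and the function names are not taken from ServeMD.

```python
import hashlib
from pathlib import Path

def docs_fingerprint(docs_root: Path) -> str:
    """Hash the relative path and content of every .md file under docs_root."""
    digest = hashlib.sha256()
    for path in sorted(docs_root.rglob("*.md")):
        digest.update(path.relative_to(docs_root).as_posix().encode("utf-8"))
        digest.update(b"\0")  # separator so path and content cannot run together
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()

def cache_is_valid(docs_root: Path, stored_fingerprint: str) -> bool:
    """The cache is reusable only if the docs still hash to the stored value."""
    return docs_fingerprint(docs_root) == stored_fingerprint
```

Because the fingerprint depends on content rather than timestamps, touching a file without changing it would not trigger a rebuild in this scheme.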

The index includes:

- Document paths (unique identifier)
- Titles (2× boost for relevance)
- Full content (for search and snippets)
- Section headings h2–h4 (1.5× boost); h3/h4 entries are where individual use-case and screen IDs live
- Structured identifiers extracted from headings (5× boost): any mixed letter+digit token such as UC-2-002, AUTH-01, G-02, KECMAP2-1234
- File path fragments: each path segment is tokenised so AUTH_01 matches the filename
- Categories (from directory structure)

Integration Examples

Quick install: Visit /servemd on this site for one-click install buttons for Cursor and VS Code, plus a ready-to-copy config snippet for Claude Desktop.

Claude Desktop

Add to your Claude Desktop configuration (claude_desktop_config.json):

{
  "mcpServers": {
    "my-docs": {
      "type": "http",
      "url": "https://your-docs-site.com/mcp"
    }
  }
}

See Settings > Developer > Edit Config in Claude Desktop to open the file directly.

Cursor

Due to a known regression in Cursor v3.0.9+ where native HTTP MCP transport is broken, use mcp-remote as a bridge:

{
  "mcpServers": {
    "my-docs": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://your-docs-site.com/mcp"]
    }
  }
}

Or visit /servemd for a one-click install button that generates this config automatically.

n8n / Make.com

Use HTTP Request nodes to call the MCP endpoint:

URL: https://docs.example.com/mcp
Method: POST
Headers: Content-Type: application/json
Body: JSON-RPC request

Custom LLM Applications

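import httpx

MCP_URL = "https://docs.example.com/mcp"

async def search_docs(query: str) -> str:
    """Call the search_docs tool and return its text result."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            MCP_URL,
            json={
                "jsonrpc": "2.0",
                "id": "1",
                "method": "tools/call",
                "params": {
                    "name": "search_docs",
                    "arguments": {"query": query},
                },
            },
        )
        # Surface HTTP-level failures instead of failing later on a missing key
        response.raise_for_status()
        payload = response.json()
        # JSON-RPC errors (e.g. rate limiting) arrive under "error", not "result"
        if "error" in payload:
            raise RuntimeError(payload["error"]["message"])
        return payload["result"]["content"][0]["text"]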

Error Codes

MCP uses standard JSON-RPC 2.0 error codes:

| Code | Meaning | Common Causes |
|---|---|---|
| `-32700` | Parse error | Invalid JSON in request |
| `-32600` | Invalid request | Missing `method` or `jsonrpc` |
| `-32601` | Method not found | Unknown MCP method |
| `-32602` | Invalid params | Validation error, file not found |
| `-32603` | Internal error | Rate limit, index not ready |
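On the client side, these codes can be turned into exceptions so callers do not have to inspect raw responses. The sketch below is one possible shape; `McpError`, `unwrap_result`, and `retry_after_seconds` are hypothetical names, and the `retryAfter` field is taken from the rate-limit example in this document.

```python
from typing import Any, Optional

class McpError(Exception):
    """A JSON-RPC error returned by the MCP endpoint."""

    def __init__(self, code: int, message: str, data: Any = None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.data = data

def unwrap_result(response: dict) -> dict:
    """Return the `result` member, or raise McpError if the response is an error."""
    error = response.get("error")
    if error is not None:
        raise McpError(error.get("code"), error.get("message", ""), error.get("data"))
    return response["result"]

def retry_after_seconds(error: McpError) -> Optional[int]:
    """Seconds to wait if this -32603 error carries a retryAfter hint, else None."""
    if error.code == -32603 and isinstance(error.data, dict):
        return error.data.get("retryAfter")
    return None
```

Since -32603 covers both rate limiting and an index that is not ready, checking for `retryAfter` is what distinguishes the two cases in this sketch.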

Comparison: MCP vs llms.txt

| Feature | llms.txt | llms-full.txt | MCP |
|---|---|---|---|
| Context size | ~10KB | ~500KB | ~2KB |
| Precision | Low | Low | High |
| Interactive | No | No | Yes |
| Real-time | No | No | Yes |
| Scales to | 100 pages | 50 pages | Unlimited |

Recommendation: Use MCP for interactive queries, llms.txt for quick overviews, and llms-full.txt for offline/batch processing.

CLI Tools

ServeMD includes command-line tools for managing the MCP search index.

Available Commands

# Build or rebuild the search index
uv run python -m docs_server.mcp.cli build

# Force rebuild (ignore cache)
uv run python -m docs_server.mcp.cli build --force

# Validate cached index
uv run python -m docs_server.mcp.cli validate

# Show index statistics
uv run python -m docs_server.mcp.cli info

# Clear cached index
uv run python -m docs_server.mcp.cli invalidate

# Clear without confirmation
uv run python -m docs_server.mcp.cli invalidate --confirm

Command Details

build

Builds the search index from documentation files. If a valid cache exists, it will be used unless --force is specified.

$ uv run python -m docs_server.mcp.cli build
2026-01-31 13:48:58 [INFO] Building MCP search index...
2026-01-31 13:48:58 [INFO] DOCS_ROOT: /app/docs
2026-01-31 13:48:58 [INFO] CACHE_ROOT: /app/cache
2026-01-31 13:48:58 [INFO] ✅ MCP index built (184.4ms, 14 docs)

validate

Checks if the cached index is valid and can be used.

$ uv run python -m docs_server.mcp.cli validate
2026-01-31 13:48:59 [INFO] Validating MCP search index cache...
2026-01-31 13:48:59 [INFO] ✅ Cache is valid
2026-01-31 13:48:59 [INFO]    Index version: 1.0
2026-01-31 13:48:59 [INFO]    Documents: 14

info

Shows detailed information about the index including configuration, statistics, and cache status.

$ uv run python -m docs_server.mcp.cli info
2026-01-31 13:49:02 [INFO] MCP Search Index Information
2026-01-31 13:49:02 [INFO] ============================================================
2026-01-31 13:49:02 [INFO]
📋 Configuration:
2026-01-31 13:49:02 [INFO]   DOCS_ROOT:    /app/docs
2026-01-31 13:49:02 [INFO]   MCP_ENABLED:  True
...

invalidate

Clears the cached index and metadata. The next server startup will rebuild the index.

$ uv run python -m docs_server.mcp.cli invalidate
This will delete: /app/cache/mcp
Are you sure? [y/N]: y
2026-01-31 13:49:02 [INFO] ✅ Cache cleared successfully

Use Cases

Pre-build index for production:

# In Dockerfile or deployment script
uv run python -m docs_server.mcp.cli build

Verify cache after deployment:

uv run python -m docs_server.mcp.cli validate && echo "Ready"

Debug index issues:

uv run python -m docs_server.mcp.cli info

Force rebuild after doc changes:

uv run python -m docs_server.mcp.cli build --force
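These use cases can be combined into a single deployment check: validate the cache, and force a rebuild only if validation fails. The sketch below wraps the commands shown above with `subprocess`; it assumes `validate` exits non-zero when the cache is unusable (as the `validate && echo "Ready"` pattern suggests), and `ensure_index` is a hypothetical helper name.

```python
import subprocess

CLI = ["uv", "run", "python", "-m", "docs_server.mcp.cli"]

def ensure_index(run=subprocess.run) -> bool:
    """Return True once a usable index exists, rebuilding it if needed."""
    # Assumption: `validate` signals an invalid cache through its exit code
    if run(CLI + ["validate"]).returncode == 0:
        return True
    return run(CLI + ["build", "--force"]).returncode == 0
```

The `run` parameter exists only so the function can be exercised without invoking the real CLI; in a deployment script, calling `ensure_index()` with no arguments is enough.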

Troubleshooting

"Search index not initialized"

The search index hasn't finished building. This can happen if:

- The server just started (wait a few seconds)
- The index build failed (check logs)
- MCP_ENABLED=false is set

Solution: Check index status with uv run python -m docs_server.mcp.cli info

"Rate limit exceeded"

You've exceeded the rate limit (120 requests per 60 seconds per IP by default). Wait for the number of seconds given in retryAfter, or raise the limit:

MCP_RATE_LIMIT_REQUESTS=300
MCP_RATE_LIMIT_WINDOW=60

No search results

Check that your docs are in DOCS_ROOT
Verify files have .md extension
Try broader search terms
Check for typos (or use fuzzy search: term~)
For structured IDs (UC-2-002, AUTH-01 …) type the identifier exactly — they are indexed verbatim from headings and match with high precision

Debug: Run uv run python -m docs_server.mcp.cli info to check indexed document count

Cache validation fails

If the cache keeps rebuilding on every startup:

- Check that the DOCS_ROOT path is consistent
- Verify file permissions on CACHE_ROOT/mcp/
- Look for file modification time issues (e.g., volume mounts)

Solution: Force rebuild with uv run python -m docs_server.mcp.cli build --force