
# Doc Scraper

by Sriram-PR
Convert technical documentation sites into clean Markdown for LLM ingestion and RAG pipelines.
## What it does

Doc Scraper is a high-performance Go-based web crawler designed to transform complex documentation websites into structured Markdown files. It eliminates web clutter, preserves site hierarchy, and optimizes content for Large Language Models (LLMs), making it a useful tool for building RAG (Retrieval-Augmented Generation) systems.

## Tools

- `list_sites`: Lists all configured sites from the config file.
- `get_page`: Fetches a single URL and returns its content as Markdown.
- `crawl_site`: Starts a background crawl for a specific site.
- `get_job_status`: Checks the progress of a background crawl job.
- `search_crawled`: Searches previously crawled content within JSONL files.

## Installation

Add the following to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "doc-scraper": {
      "command": "/path/to/doc-scraper",
      "args": ["mcp-server", "-config", "/path/to/config.yaml"]
    }
  }
}
```

## Supported hosts

Confirmed support for Claude Desktop, Cursor, and Claude Code.
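The installation snippet above passes a `config.yaml` to the server, but this page does not show that file's schema. As a purely hypothetical sketch of what a per-site crawl configuration might contain (every field name here is an assumption, not the tool's documented format):

```yaml
# Hypothetical config.yaml sketch -- field names are illustrative
# assumptions, not doc-scraper's documented schema.
sites:
  go-docs:
    start_url: https://go.dev/doc/        # where the crawl begins
    allowed_prefix: https://go.dev/doc/   # stay within this URL prefix
    output_dir: ./crawled/go-docs         # where Markdown/JSONL output lands
    max_depth: 3                          # limit crawl depth
```

Consult the project's own documentation for the real configuration keys.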
## Quick install

```shell
go install github.com/Sriram-PR/doc-scraper/cmd/doc-scraper@latest
```
## Information

- Pricing: free
- Published: 4/18/2026