This project provides a toolset to crawl websites, wikis, and tool/library documentation, generate Markdown documentation from them, and make that documentation searchable via a Model Context Protocol (MCP) server, designed for integration with tools like Cursor.
Overview
What is MCPDocSearch?
MCPDocSearch is a toolset designed to crawl websites, generate Markdown documentation, and make that documentation searchable via a Model Context Protocol (MCP) server, facilitating integration with tools like Cursor.
How to use MCPDocSearch?
To use MCPDocSearch, you first run the crawler_cli to crawl a website and generate a Markdown file. Then, you run the mcp_server to load and serve the documentation, allowing clients like Cursor to query the content.
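The crawl step can be pictured as a depth-limited, breadth-first traversal that only follows links matching a URL pattern. The sketch below is an illustration under stated assumptions, not the project's actual crawler: the function name `crawl`, its parameters, and the in-memory `SITE` link graph (used instead of real HTTP requests) are all hypothetical.

```python
import re
from collections import deque

# Hypothetical in-memory "site": each URL maps to the links found on that page.
SITE = {
    "https://example.com/docs/": [
        "https://example.com/docs/a",
        "https://example.com/blog/x",
    ],
    "https://example.com/docs/a": ["https://example.com/docs/b"],
    "https://example.com/docs/b": ["https://example.com/docs/c"],
    "https://example.com/docs/c": [],
    "https://example.com/blog/x": [],
}


def crawl(start, max_depth, include_pattern, get_links):
    """Breadth-first crawl from `start`, following links at most `max_depth` hops."""
    pattern = re.compile(include_pattern)
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:
            continue  # depth limit reached: do not expand this page's links
        for link in get_links(url):
            # Skip already-seen URLs and anything outside the include pattern.
            if link not in seen and pattern.search(link):
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


pages = crawl(
    "https://example.com/docs/",
    max_depth=2,
    include_pattern=r"/docs/",
    get_links=lambda url: SITE.get(url, []),
)
print(pages)
```

With a depth of 2, the page three hops away is never visited, and the blog link is skipped because it does not match the include pattern.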
Key features of MCPDocSearch?
- Web Crawler (crawler_cli): Configurable crawling of websites with options for depth, URL patterns, and HTML cleaning.
- MCP Server (mcp_server): Loads Markdown files, parses them into semantic chunks, and exposes tools for searching and retrieving documentation.
- Cursor Integration: Designed for seamless operation with Cursor, allowing for easy querying of documentation.
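To make the "parse into chunks, then search" idea concrete, here is a minimal sketch that splits Markdown at headings and runs a plain substring search over the result. Heading-based splitting is used here as a simple stand-in for semantic chunking; the function names `chunk_markdown` and `search` are hypothetical and are not taken from the project. For brevity, the sketch does not special-case `#` lines inside fenced code blocks.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_markdown(text):
    """Split Markdown into chunks, one per heading, using the heading as the title."""
    chunks = []
    title, body = "(preamble)", []

    def flush():
        # Only keep chunks that have some non-blank content.
        if any(line.strip() for line in body):
            chunks.append({"title": title, "content": "\n".join(body).strip()})

    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title, body = match.group(2).strip(), []
        else:
            body.append(line)
    flush()
    return chunks


def search(chunks, query):
    """Return titles of chunks whose title or content contains `query` (case-insensitive)."""
    q = query.lower()
    return [
        c["title"]
        for c in chunks
        if q in c["title"].lower() or q in c["content"].lower()
    ]


DOC = """# Install
Run the installer.

## Usage
Call the search tool with a query.
"""

chunks = chunk_markdown(DOC)
print(search(chunks, "query"))
```

A real server would likely rank results or use embeddings; substring matching is only the simplest thing that shows the data flow from file to chunk to search result.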
Use cases of MCPDocSearch?
- Crawling and documenting API references from various websites.
- Creating searchable documentation for internal company resources.
- Integrating with tools like Cursor for enhanced documentation accessibility.
FAQ about MCPDocSearch
- Can MCPDocSearch crawl any website?
Yes, as long as the website allows crawling and the crawl respects the site's robots.txt rules.
- Is there a limit to the crawl depth?
Yes, the maximum crawl depth is configurable, typically between 1 and 5.
- How do I integrate MCPDocSearch with Cursor?
You need to configure a .cursor/mcp.json file in the project root with the appropriate settings for the MCP server.
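As a rough illustration of what such a file might contain, the sketch below builds a configuration with a top-level `mcpServers` entry and prints it as JSON. The server name `doc-search`, the `python` command, and the `-m mcp_server.main` arguments are assumptions made for this example; check the project's own instructions for the actual launch command before using it.

```python
import json

# All values below are illustrative assumptions, not the project's documented settings.
config = {
    "mcpServers": {
        "doc-search": {  # hypothetical server name shown in the client
            "command": "python",  # assumed launcher
            "args": ["-m", "mcp_server.main"],  # assumed module path
        }
    }
}

config_text = json.dumps(config, indent=2)
print(config_text)
```

The printed JSON would be saved as .cursor/mcp.json in the project root once the command and arguments match how the server is actually started.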