Python-based Model Context Protocol (MCP) server for real-time microphone input to Claude Desktop on Windows. FastMCP + sounddevice + multiple STT engines for sub-500ms latency voice conversations.
Overview
What is Claude Desktop Real-time Audio MCP Server?
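Claude Desktop Real-time Audio MCP Server is a Python-based Model Context Protocol (MCP) server that streams real-time microphone input to Claude Desktop on Windows. It combines FastMCP, sounddevice, and multiple speech-to-text (STT) engines to target voice conversations with sub-500ms latency.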
How to use Claude Desktop Real-time Audio MCP Server?
To use the server, clone the repository, set up a virtual environment, install dependencies, configure your audio settings and STT engines, and run the server.
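The configuration step could look roughly like the sketch below, which layers defaults, a JSON document, and environment-variable overrides. The field names, the `AUDIO_MCP_` prefix, and the `load_config` helper are all hypothetical and chosen for illustration; the project's actual schema may differ.

```python
import json
import os
from dataclasses import asdict, dataclass


@dataclass
class AudioConfig:
    # Hypothetical fields and defaults, not the project's real schema.
    sample_rate: int = 16000
    channels: int = 1
    block_ms: int = 30
    stt_engine: str = "whisper"


def load_config(json_text=None, env=None, prefix="AUDIO_MCP_"):
    """Merge defaults, then JSON values, then environment overrides."""
    env = os.environ if env is None else env
    values = asdict(AudioConfig())

    if json_text:
        loaded = json.loads(json_text)
        unknown = set(loaded) - set(values)
        if unknown:
            raise ValueError(f"unknown config keys: {sorted(unknown)}")
        values.update(loaded)

    # Environment variables win over the file, e.g. AUDIO_MCP_SAMPLE_RATE=48000.
    for key, default in list(values.items()):
        raw = env.get(prefix + key.upper())
        if raw is not None:
            values[key] = type(default)(raw)  # coerce the string to the default's type

    return AudioConfig(**values)


if __name__ == "__main__":
    cfg = load_config('{"sample_rate": 44100}', env={"AUDIO_MCP_STT_ENGINE": "azure"})
    print(cfg)
```

Letting environment variables take precedence keeps secrets and machine-specific settings out of the checked-in file.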
Key features of Claude Desktop Real-time Audio MCP Server
- Real-time audio capture with sub-500ms latency.
- Supports multiple speech-to-text engines including OpenAI Whisper, Azure Speech, and Google Speech-to-Text.
- Easy configuration through JSON/YAML files and environment variables.
- Comprehensive logging and performance monitoring.
- Async architecture for non-blocking operations.
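To illustrate how an async architecture keeps capture and transcription from blocking each other, here is a minimal producer/consumer sketch built on `asyncio.Queue`. The microphone is simulated with timed blocks of silence and the STT call is a placeholder, so this is not the server's actual pipeline; `capture`, `transcribe`, and `run_pipeline` are hypothetical names.

```python
import asyncio
import time


async def capture(queue, n_blocks, block_ms=30):
    """Simulated microphone: emit timestamped blocks of silence."""
    for _ in range(n_blocks):
        await asyncio.sleep(block_ms / 1000)  # stands in for waiting on the device
        # 30 ms of 16 kHz, 16-bit mono audio is 480 samples = 960 bytes.
        await queue.put((time.perf_counter(), bytes(960)))
    await queue.put(None)  # sentinel: end of stream


async def transcribe(queue, results):
    """Consume blocks as they arrive and record per-block queueing latency."""
    while True:
        item = await queue.get()
        if item is None:
            break
        captured_at, block = item
        await asyncio.sleep(0)  # placeholder for a real STT request
        latency_ms = (time.perf_counter() - captured_at) * 1000
        results.append((len(block), latency_ms))


async def run_pipeline(n_blocks=5):
    queue = asyncio.Queue(maxsize=32)  # bounded, so a slow consumer applies backpressure
    results = []
    await asyncio.gather(capture(queue, n_blocks), transcribe(queue, results))
    return results


if __name__ == "__main__":
    for size, latency in asyncio.run(run_pipeline()):
        print(f"{size} bytes, {latency:.2f} ms in queue")
```

With a real device, an audio library's callback typically runs on its own thread, so blocks would need to be handed to the event loop in a thread-safe way (for example with `loop.call_soon_threadsafe`) rather than awaited directly.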
Use cases of Claude Desktop Real-time Audio MCP Server
- Enabling voice-driven interactions with Claude Desktop.
- Real-time transcription of spoken language into text.
- Voice activity detection for improved audio processing.
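The listing does not say which voice activity detection method the server uses. As one simple possibility, the sketch below applies an energy (RMS) threshold with a short hangover so that brief pauses do not cut an utterance. The threshold, the hangover length, and the function names are assumptions for illustration only.

```python
import math
import struct


def frame_rms(frame: bytes) -> float:
    """RMS amplitude of a frame of 16-bit little-endian mono PCM."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def detect_speech(frames, threshold=500.0, hangover=2):
    """Return one bool per frame; stay active for `hangover` quiet frames after speech."""
    flags = []
    remaining = 0  # quiet frames still treated as speech
    for frame in frames:
        if frame_rms(frame) >= threshold:
            remaining = hangover
            flags.append(True)
        elif remaining > 0:
            remaining -= 1
            flags.append(True)
        else:
            flags.append(False)
    return flags


if __name__ == "__main__":
    silence = bytes(960)                             # 480 zero samples
    loud = struct.pack("<480h", *([2000] * 480))     # constant amplitude 2000
    print(detect_speech([silence, loud, silence, silence, silence]))
```

Only frames flagged as speech would then need to be forwarded to an STT engine, which reduces both cost and latency on silent input.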
FAQ about Claude Desktop Real-time Audio MCP Server
- What platforms does it support?
  It supports Windows 10/11 and requires Python 3.8 or higher.
- Is it free to use?
  Yes, it is open-source and available under the MIT License.
- How can I contribute?
  Contributions are welcome, especially in areas like additional STT engines and cross-platform support.