#audio

52 results found

ElevenLabs MCP Server

Mirror of

MCP TTS Say

MCP Server Tool for Text To Speech

Video Extraction Server

MCP Video & Audio Text Extraction Server: a Model Context Protocol (MCP) server that enables text extraction from various video platforms and audio files, allowing compatible host applications (like Claude Desktop, Cursor) to access video content and perform text transcription.

What is it?
MCP Video & Audio Text Extraction Server is a Model Context Protocol (MCP) server that can download videos from various platforms, extract audio, and convert it to text. The server uses OpenAI's Whisper model for high-quality audio-to-text conversion.

How to use it?
1. Clone the repository and install dependencies.
2. Ensure FFmpeg is installed.
3. Run the server.
4. Configure your MCP host application (like Claude Desktop) to use the server.

Key Features
- Supports video downloads from multiple platforms, including YouTube, Bilibili, TikTok, etc.
- Extracts audio content from videos.
- High-quality speech recognition using the Whisper model.
- Multi-language text recognition support.
- Asynchronous processing for large files.
- Standardized MCP tools interface.

Use Cases
- Provide text transcription capabilities for applications that need to process video content.
- Batch process video content and extract text information.
- Create custom applications requiring audio/video text extraction functionality.
- Enable AI assistants to understand video content.

FAQ
What are the system requirements to run the server?
> Requires Python 3.9+, FFmpeg, and a minimum of 8GB RAM; GPU acceleration is recommended.
What should I know about first run?
> The system will automatically download the Whisper model file (approximately 1GB), which may take several minutes to tens of minutes.
What audio formats are supported?
> Supports common audio formats including mp3, wav, m4a, etc.
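Two of the system requirements named in the Video Extraction Server FAQ (Python 3.9+ and FFmpeg) can be checked before the server is run. The following is a minimal sketch using only the Python standard library; the function name `missing_requirements` is hypothetical and is not part of the server itself, and the RAM and GPU recommendations are not checked here.

```python
import shutil
import sys

# Minimum Python version stated in the listing's FAQ ("Requires Python 3.9+").
MIN_PYTHON = (3, 9)


def missing_requirements(python_version, ffmpeg_path):
    """Return a list of human-readable problems (empty if none were found).

    python_version: a (major, minor, ...) tuple such as sys.version_info.
    ffmpeg_path: result of shutil.which("ffmpeg"), or None if not on PATH.
    Hypothetical helper for illustration; not part of the server's code.
    """
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append(
            "Python %d.%d or newer is required" % MIN_PYTHON
        )
    if ffmpeg_path is None:
        problems.append("FFmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    # Check the current interpreter and look up ffmpeg on PATH.
    found = missing_requirements(sys.version_info, shutil.which("ffmpeg"))
    for problem in found:
        print(problem)
    if not found:
        print("Python and FFmpeg requirements look satisfied")
```

Passing the version and the FFmpeg path in as arguments, rather than reading them inside the function, keeps the check easy to exercise with made-up values.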

Audacity MCP Server

MCP server for Audacity

Pure Data MCP Server

A Model Context Protocol (MCP) server for Pure Data, an open-source visual programming language and patchable environment for real-time computer music.

MMAudio

AI-powered video-to-audio and text-to-audio generation using MMAudio's advanced AI technology.

Beatlyze

Analyze audio files and get BPM, key, energy, mood, and more. Send a URL or upload a file, get structured JSON results.

Claude Desktop Real-time Audio MCP

Real-time microphone input MCP server for Claude Desktop on Windows - enabling live voice conversations with Claude through WASAPI audio capture and real-time speech recognition

Claude Desktop Real-time Audio MCP Server (Python Implementation)

Python-based Model Context Protocol (MCP) server for real-time microphone input to Claude Desktop on Windows. FastMCP + sounddevice + multiple STT engines for sub-500ms latency voice conversations.

Storyflo

Curated audio-news MCP server. Search trending articles, fetch narrated audio, subscribe topic feeds. OAuth 2.1 + RFC 7591 Dynamic Client Registration. Free tier; premium briefings via x402 over stablecoin settlements.

gradio-transcript-mcp: A Gradio MCP Server for Audio/Video Transcription from URLs

Gradio demo and MCP server that generates transcripts from audio/video URLs.

TTS MCP

An MCP server that gives AI assistants a physical voice by streaming synthesized speech directly to your local system speakers. Features built-in persona mappings and supports providers like ElevenLabs, FishAudio, OpenAI and many more.

eShopLite 🛒

eShopLite is a set of reference .NET applications implementing an eCommerce site with features like Semantic Search, MCP, Reasoning models and more.

Sonos MCP

MCP server for controlling Sonos speakers and playing audio streams.

Classic Books

Search and read classic books, world literature, and sacred religious texts across multiple sources. Supports Chinese classics (四书五经、四大名著、诸子百家) via Wikisource, Japanese literature via Aozora Bunko (青空文庫), Korean classics, daily Chinese poems, and free audiobooks from LibriVox. Multilingual support (zh/ja/ko/en/fr/de and more). Powered by Cloudflare Workers, free to use.

deAPI MCP Server

MCP server for deAPI — 35 AI tools: transcription, image/video generation, TTS, music, OCR, embeddings via deAPI.

FeedNest

ChatGPT searches the internet. Perplexity crawls billions of pages. When you ask about your industry, your interests, your world, they give you everything except what matters. FeedNest is different. Your AI works exclusively with the sources you choose: the blogs you trust, the publications you follow, the experts you believe in. No algorithmic noise. No black-box results. Intelligence grounded in your world. This MCP server gives any AI assistant direct access to your feeds, articles, highlights, notes, and tags through natural language.

StudioSphere Pulse

Privacy-first audio intelligence for agents and creative audio workflows. Analyze public audio URLs for BPM, musical key, and waveform shape. Audio is processed ephemerally and not stored; pay per second with no account required for the public UI.

Supertonic3 MCP

Local, on-device text-to-speech for Claude & Cursor. No API key, no cloud. Powered by Supertonic 3 — 10 voices, 31 languages, inline expression tags (<laugh>, <pause>, etc.). ~820ms synthesis on Apple Silicon. MIT licensed.

Boost.audio

AI audio tools for music producers — stem splitting, vocal removal, BPM/key detection, audio-to-MIDI, format conversion, trimming, video-to-audio, AI song generation

Compeller

Create AI music videos and audio-reactive visuals from songs through MCP.

Scriptivox

MCP server for Scriptivox AI transcription. Transcribe audio and video from URLs or local files in 119 languages, with speaker diarization, word-level timestamps, and SRT/VTT/text export. Bring transcription to Claude Desktop, Cursor, Cline, and any MCP-compatible client.

OnChain Music

Search and license 5,000+ fully cleared, independently owned music tracks via MCP. AI agents can find music by genre, mood, BPM, key, and instrumentation, then license instantly with USDC on Base. No human in the loop. Also includes free music industry knowledge tools: platform loudness standards, royalty split calculators, and genre conventions. Works for any music project, not just our catalog.
