#vision
29 results found
🚀 OpenCV MCP Server
OpenCV MCP Server provides OpenCV's image and video processing capabilities through the Model Context Protocol (MCP). Access powerful computer vision tools for tasks ranging from basic image manipulation to advanced object detection and tracking.
MCP OpenVision
MCP Server using OpenRouter models to get descriptions for images
groundlight-mcp-server
MCP Server for Groundlight
MCP Server for CVDLT(Computer Vision & Deep Learning Tools)
This repo is built on the Model Context Protocol Python SDK. It bundles deep learning models for computer vision and exposes their capabilities to an LLM or vLLM model.
🚀 Wayland MCP Server
MCP Server for Wayland
MCPControl
MCP server for Windows OS automation
Snaprender Url To Screenshot
Screenshot API for AI agents. Capture any website as PNG, JPEG, WebP, or PDF with a single tool call. Supports full-page capture, device emulation (iPhone, iPad, Pixel, MacBook), dark mode, ad blocking, cookie banner removal, custom viewports, and CSS selector hiding. Includes cache checking (free, doesn't count against quota) and real-time usage monitoring. Stealth mode defeats most bot detection. Free tier: 50 screenshots/month, no credit card required.
Apple RAG MCP
Transform your AI agents into Apple development experts! Apple RAG MCP gives you instant access to official Swift docs, design guidelines, and comprehensive Apple platform knowledge through cutting-edge RAG technology. With professional AI reranking and hybrid search across iOS, macOS, watchOS, tvOS, and visionOS documentation plus Apple Developer YouTube content, you'll get precise, contextual answers every time. Compatible with Cursor, Claude Desktop, and all MCP tools - start building smarter Apple apps today!
LibreChat
Enhanced ChatGPT Clone: Features Agents, DeepSeek, Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code Interpreter, langchain, DALL-E-3, OpenAPI Actions, Functions, Secure Multi-User Auth, Presets, open-source for self-hosting. Active project.
AutoProvisioner MCP Server (open beta)
Mirror of
UI-TARS Desktop
A GUI agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
Trend Vision One MCP Server
The Trend Vision One Model Context Protocol (MCP) Server enables natural language interaction between your favourite AI tooling and the Trend Vision One web APIs. This allows users to harness the power of Large Language Models (LLMs) to interpret and respond to security events.
UI-TARS Desktop 🚀
A GUI agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
Vision Mcp Server | 图片分析 Mcp
This MCP addresses the visual recognition limitations of text-only models by describing and identifying images accurately, which makes it well suited to showing an AI a reference interface design. It currently supports dropping image links into the dialog box or placing images in the project folder for recognition. It can be added to MCP tools such as Claude Code, Cline, and Trae. Beyond programming, it also provides visual recognition for any model that lacks native image support. You can pick your preferred vision model from the ModelScope community and substitute it when filling in the MCP configuration. 📱 Daily use cases: send a screenshot and have the AI point out what went wrong; share an image link or place a screenshot in the project folder and have the AI help optimize the layout; send a product image link and have the AI write promotional copy.
Asterwise — Astrology MCP Server
MCP server for Vedic astrology. Connect Claude, ChatGPT, or any MCP-compatible AI to real ephemeris calculations. Covers natal charts, 5-level Vimshottari Dasha, yoga detection with BPHS citations, matchmaking with Rajju/Vedha vetoes, panchanga, KP system, Lal Kitab, and numerology. OAuth 2.1. Free sandbox tier — 500 calls/month, no credit card.
Frametrace | Reverse Video Search
FrameTrace is an AI-powered reverse video search engine that helps you find any video's original source, detect duplicates, and verify authenticity across platforms like YouTube, TikTok, Instagram, and Reddit. Using advanced computer vision and machine learning, it analyzes videos frame-by-frame to trace content origins even after editing or re-encoding.
Roboflow
Create, train, and deploy computer vision models.
Kelnix Receipt Mcp Api
Turn any receipt into structured, accounting-ready JSON or clean Markdown with one API call. AI-powered vision extracts merchant, date, line items, tax breakdown, totals, currency, and confidence scores, then suggests the right GL account for instant bookkeeping. 7 tools for the full receipt-to-journal-entry pipeline. Built for expense automation agents. 50 free credits on signup, no credit card required.
Superdocs
A structured-document editor for AI agents. SuperDocs gives your AI 21 MCP tools and 4 workflow prompts to make section-precise edits — bold a specific paragraph, replace a single table cell, restructure a heading — without disturbing surrounding content. Tables, borders, alternating row shading, fonts, and inline styling all survive AI edits AND round-trip exports across .docx, PDF, HTML, Markdown, and RTF. Other capabilities: pre-signed URL upload/download (no context bloat for files >100KB), compact response mode for editing 100-page documents efficiently (~140× token reduction), multimodal vision on attachments, human-in-the-loop approval for sensitive edits, and multi-language editing across 16+ languages. Free plan: 500 ops/month, no credit card required.