sound recognition - 梅斯AI导航站

SSE-based Server and mobile Angular App

MCP server for image recognition with Angular mobile client app.

download ffmpeg

originally was going to be an mcp server, now it's a stupid soundcloud scraper

Mcp Mindmesh

Claude 3.7 Swarm with Field Coherence: A Model Context Protocol (MCP) server that orchestrates multiple specialized Claude 3.7 Sonnet instances in a quantum-inspired swarm. It creates a field coherenc

MindMesh MCP Server

Claude 3.7 Swarm with Field Coherence: A Model Context Protocol (MCP) server that orchestrates multiple specialized Claude 3.7 Sonnet instances in a quantum-inspired swarm. It creates a field coherenc

Asr_mcp_server

A Model Context Protocol (MCP) server that provides ASR (Automatic Speech Recognition) capabilities using the Whisper engine. This server exposes ASR functionality through MCP tools, making it easy to

Whisper Speech Recognition MCP Server

A high-performance speech recognition MCP server based on Faster Whisper, providing efficient audio transcription capabilities.

Entity Identification

Recognize whether two sets of data are from the same entity.

MCP Image Recognition Server

An MCP server that provides image recognition 👀 capabilities using Anthropic and OpenAI vision APIs

Spark-TTS

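Overview: Spark-TTS is a new-generation speech synthesis model recently released by Mobvoi together with several leading academic institutions (such as the Hong Kong University of Science and Technology and Shanghai Jiao Tong University). Its core innovations are the BiCodec encoding technique and an architecture unified with text LLMs, using the capabilities of large language models (LLMs) to achieve highly accurate and natural speech synthesis. Spark-TTS is an advanced text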

AI Text Humanizer

An AI text humanizer transforms AI-generated content into natural, human-like text. It adds flow, uses conversational phrasing, and avoids robotic language. Our humanization tool helps create engaging

ThinkSound

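ThinkSound is the first CoT (chain-of-thought) audio generation model, released by Alibaba's Tongyi speech team. It is used for video dubbing, generating matching sound effects for every frame. The model introduces CoT reasoning to address the difficulty traditional techniques have in capturing dynamic visual details and spatial relationships, letting the AI reason step by step like a professional sound designer and generate high-fidelity audio synchronized with the picture. Audio generation is driven by a three-stage chain of thought: basic sound-effect reasoning, object-level interaction, and instruction-based editing. The model comes with the AudioCoT dataset, which contains audio data annotated with chains of thought. On VGGSoun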
