Speech-to-Text - 梅斯 AI Navigation Site

Repurpose LOL

Platform to convert audio/video to transcripts, clips, and posts.

AssemblyAI

AssemblyAI provides AI models for transcribing and understanding speech through a user-friendly API.

ChatScribe Pro

AI-powered transcription, translation, content generation, and Q&A

Audiotype - Audio Transcription and Video Subtitles

Automatic transcription software for businesses and organizations.

Dubformer

AI Dubbing & localization for media industry

My Speaking Score

Prepare for TOEFL Speaking with speech assessment tools and ETS® SpeechRater™ scoring engine.

TalkNotes

Transcribe, clean, and structure your voice into usable content.

Better Speech Online Speech Therapy

Convenient, effective & affordable online speech therapy.

TranscribeMe

Convert voice notes from WhatsApp and Telegram to text with TranscribeMe for free.

Countless.dev

Compare and evaluate various AI models and their specifications.

Voiser

Voiser is an AI program that converts text to speech and speech to text with human-like voices.

Rask AI

Rask AI provides top-quality AI video dubbing and localization with 130+ languages.

Voicenotes.com

Dump your thoughts. Perfect memory.

Happy Scribe

Audio to text transcription and video subtitles with high accuracy.

Free Transcription Tool Deepgram

Free AI transcription tool for converting audio to text.

Deepgram Voice AI

Real-time speech-to-text and text-to-speech APIs powered by Deepgram's voice AI models

HitPaw Edimakor

AI video editor with advanced features

Kimi-Audio

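Kimi-Audio is an open-source audio foundation model that excels at audio understanding, generation, and conversation. This repository contains the official implementation, models, and evaluation toolkit for Kimi-Audio. Universal capabilities: handles a range of tasks including automatic speech recognition (ASR), audio question answering (AQA), audio captioning (AAC), speech emotion recognition (SER), sound event/scene classification (SEC/ASC), and end-to-end speech conversation. State-of-the-art performance: achieves SOTA results on numerous audio benchmarks (see the evaluation and technical report).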

Speechmatics

Build voice AI healthcare products with trusted speech-to-text technology.

meeting minutes

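Focus on your conversation while Meetily's AI automatically captures, transcribes, and summarizes your meetings. 100% open source, self-hosted, and privacy-first - an alternative to Granola and Otter AI. Works with Google Meet, Zoom, and Teams. Capture live meeting audio with one click; real-time transcription as people speak; AI-generated summaries and action items; 100% open source and fully transparent; self-hosted for full data control; 100% private…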

spatial speech translation

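Official repository for the CHI 2025 paper "Spatial Speech Translation: Translating Across Space With Binaural Hearables". 💡 Features: The first to achieve speech translation under multi-speaker and interference conditions. The simultaneous, expressive speech translation model can run in real time on Apple silicon. The first binaural rendering of translated speech, preserving spatial cues from the input to the translated output. 📑 Open source…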

RealtimeVoiceChat

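RealtimeVoiceChat is an open-source real-time AI voice chat assistant. The voice sounds fairly natural, it supports interruption and two-way voice interaction with low latency, and you can watch the live speech transcription and the AI's replies. It is practical for building AI voice assistants for customer service, education, companionship, and similar scenarios. A sophisticated client-server system built for low-latency interaction: 🎙️ Capture: your voice is captured by your browser. ➡️ Stream: audio chunks are sent to a Python backend over WebSockets. ✍️ Transcribe…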

AgenticSeek

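Similar to Manus, but running local models with Deepseek R1 agents. A local alternative to Manus AI: a voice-enabled LLM assistant that can write code, access files on your computer, browse the web, and automatically correct and reflect on its own mistakes. Most importantly, it sends no data to the cloud. Built on reasoning models such as DeepSeek R1, it runs entirely on local hardware, which keeps your data private. Features: 100% local: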

Unmute

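Unmute is a low-latency voice interaction system from Kyutai, focused on low-latency speech-to-text and text-to-speech. Built on advanced AI models, Unmute gives users a real-time, efficient voice interaction experience. Users talk to the AI by voice, and text is quickly converted into natural, fluent speech output. Unmute's low-latency processing enables seamless voice interaction. Main features of Unmute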
