Search results for the keyword "Multimodal understanding": 20 in total, showing only the first 480
The core MCP extension for the Systemprompt MCP multimodal client
A document reading, search, and metadata server to provide access to PDFs (and, in the future, other formats) to LLMs via the Model Context Protocol (MCP)
A knowledge base and experimental playground for exploring the Model Context Protocol (MCP)—understanding how hosts, servers, LLMs, and tools interact.
A suite of Model Context Protocol (MCP) servers designed to enhance AI agent capabilities. Provides tools for media search/understanding (images, video), web information retrieval, PDF generation, and
A Model Context Protocol (MCP) server that enables AI assistants to generate images, text, and audio through the Pollinations APIs. Supports customizable parameters, image saving, and multiple model o
MCP for AWS Cost Explorer and CloudWatch logs
awsome kali MCPServers is a set of MCP servers tailored for Kali Linux, designed to empower AI Agents in reverse engineering and security testing. It offers flexible network analysis, target sniffing,
MCP Server to assist LLMs and humans on Model Context Protocol spec compliance and understanding
A powerful, production-ready context management system for Large Language Models (LLMs). Built with ChromaDB and modern embedding technologies, it provides persistent, project-specific memory capabili
MCP architecture demo with single-server & multi-server clients using LangGraph, AI-driven tool calls, and async communication.
MCP server for OpenRouter providing text chat and image analysis tools
A multimodal MCP server
MCP server for understanding AWS spend
Server that shows trending tokens and integrates Grok, xAI image understanding and vision (interpreted as a vision-capable AI), and Claude's computer use capabilities.
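Kunlun Wanwei has officially open-sourced the (17B+) Matrix-Game large model, the interactive video generation model within the Matrix-Zero world model. Matrix-Game is the first formal release of the Matrix series in interactive world generation and the industry's first open-source 10B+ spatial intelligence model. It is an interactive world foundation model for game-world modeling, designed for high-quality generation and precise control in open-ended environments. Spatial intelligence, an important frontier technology of the AI era, is reshaping our ... with the virtual world
Nexus-Gen: a unified model for image understanding, generation, and editing, and an open-source alternative to GPT-4o. To-do: release the training and inference code; release the model checkpoints; release the technical report; release the training dataset. What is Nexus-Gen? Nexus-Gen is a unified model that combines the language reasoning capabilities of LLMs with the image synthesis capabilities of diffusion models. To align the embedding
MMaDA (Multimodal Large Diffusion Language Models) is a multimodal diffusion model from Princeton University, Tsinghua University, Peking University, and ByteDance that achieves excellent performance across textual reasoning, multimodal understanding, and text-to-image generation. The model uses a unified diffusion architecture with a modality-agnostic design that eliminates the need for modality-specific components, introduces a mixed long chain-of-thought (CoT) fine-tuning strategy that unifies the CoT format across modalities, and introduces UniGRPO for diffusion foundation models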