
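Mathpix – AI Document Conversion Tool
<p style="text-align: left; line-height: 2;">Mathpix is an optical character recognition (OCR) tool that extracts handwritten or printed math formulas, chemical equations, and tables from images and PDF files and converts them into editable formats such as LaTeX and Markdown. It offers an API and productivity apps for students, teachers, researchers, and developers, with support for multiple languages and in-depth STEM features. Mathpix also provides enterprise-grade secure conversion services and a collaborative editing environment designed for researchers, supporting scientific communication and document digitization.</p><h2 style="text-align: left;">Mathpix official website</h2><ul><li style="text-align: left;"><strong>Official website</strong>: https://mathpix.com/</li></ul><h2 style="text-align: left;">Mathpix pricing</h2><ul><li style="text-align: left;">Free plan: for students and casual users; includes 10 PDF pages and 10 images, and signing up with an .edu email doubles the usage allowance.</li><li style="text-align: left;">Pro plan: $4.99 per month, for students, teachers, and STEM professionals; includes 1,000 PDF pages and 5,000 images.</li><li style="text-align: left;">Team plan: $9.99 per month, designed for departments, schools, and companies; covers 2 users, each with 1,000 PDF pages and 5,000 images.</li><li style="text-align: left;">Enterprise/institutional plan: custom pricing and a simplified sign-up process, for organizations that need a more flexible long-term solution.</li></ul>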

dots.ocr
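<p>dots.ocr is a multilingual document layout parsing model open-sourced by Xiaohongshu's hi lab. Built on a 1.7-billion-parameter vision-language model (VLM), it performs layout detection and content recognition in a single model while preserving reading order. Despite its small size, it reports state-of-the-art (SOTA) results on benchmarks such as OmniDocBench, with formula recognition comparable to much larger models such as Doubao-1.5 and gemini2.5-pro, and a marked advantage on low-resource languages. The architecture is simple: switching tasks only requires changing the input prompt, and inference is fast, which makes the model suitable for a range of document parsing scenarios.</p> <h2 style="font-size: 20px;">dots.ocr project links</h2> <ul> <li>GitHub repository: https://github.com/rednote-hilab/dots.ocr</li> <li>HuggingFace model hub: https://huggingface.co/rednote-hilab/dots.ocr</li> <li>Online demo: https://dotsocr.xiaohongshu.com/</li> </ul>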
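<p>The prompt-driven task switching described above can be sketched as follows. This is a minimal illustration only: the task names, the prompt strings, and the <code>fake_model</code> stub are hypothetical placeholders, not the prompts or API that dots.ocr actually ships.</p>

```python
from typing import Callable, Dict

# Hypothetical task-to-prompt table (an assumption for illustration);
# the real prompt texts live in the dots.ocr repository.
TASK_PROMPTS: Dict[str, str] = {
    "layout_and_content": "Detect the layout and transcribe every element in reading order.",
    "layout_only": "Detect the layout elements and return their bounding boxes only.",
    "ocr_only": "Transcribe the text in the image.",
}


def build_request(task: str, image_path: str) -> Dict[str, str]:
    """Pair one image with the prompt for the chosen task."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"unknown task: {task!r}")
    return {"image": image_path, "prompt": TASK_PROMPTS[task]}


def parse_page(
    task: str,
    image_path: str,
    run_model: Callable[[Dict[str, str]], str],
) -> str:
    """Run the same model for any task; only the prompt differs."""
    return run_model(build_request(task, image_path))


def fake_model(request: Dict[str, str]) -> str:
    """Stand-in for a real VLM call so the sketch runs without model weights."""
    return f"[{request['prompt'][:6]}...] {request['image']}"


if __name__ == "__main__":
    for task in TASK_PROMPTS:
        print(task, "->", parse_page(task, "page_1.png", fake_model))
```

<p>Keeping the prompts in one table means a new task is added by registering a new prompt, with no change to the model-calling code.</p>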

飞搜侠
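<p>飞搜侠 is a search tool dedicated to Feishu documents that helps users quickly find high-quality Feishu document resources. Its search matches the keywords a user enters, locates the relevant documents, and provides one-click links so the content can be previewed right away. 飞搜侠 also works on mobile, so users can search anywhere, which suits mobile work and study. Popular searches span Prompt collections, AI tools, self-media entrepreneurship, workplace skills, operations techniques, and personal growth.</p> <h2 style="font-size: 20px;">Main features of 飞搜侠</h2> <ul> <li> <div class="paragraph">Fast keyword search: enter a title, a fragment of body text, or an author name, and matching Feishu documents are returned within seconds. The search covers both title and body fields, so it is noticeably more precise than approaches that index titles only.</div> </li> <li> <div class="paragraph">Result filtering: results can be filtered by creation time, document type, sharing scope, and other dimensions, making it easy to focus on recent files or a specific format.</div> </li> <li> <div class="paragraph">One-click open and copy: clicking a result opens the original Feishu document directly, or the link can be copied and shared with colleagues, with no extra redirects.</div> </li> <li> <div class="paragraph">Search algorithm: the app issues keyword-based crawl requests against Feishu's public domain and combines them with a local index to return results; the experience is close to native global search but with broader coverage.</div> </li> <li> <div class="paragraph">Reverse search by image text: documents can be found from the text contained in an image, which addresses the case where a user remembers what an image showed but not the keywords.</div> </li> <li> <div class="paragraph">One-click popular tags: the home page presets popular tags such as “Prompt collections”, “AI tools”, and “self-media”; a single click runs the corresponding search, lowering the barrier to retrieval.</div> </li> </ul> <h2 style="font-size: 20px;">How to use 飞搜侠</h2> <ul> <li> <div class="paragraph">Open the platform: visit the official 飞搜侠 website: <a href="https://www.feisoo.com/" target="_blank" rel="noopener">https://www.feisoo.com/</a> </div> </li> </ul>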
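<p>The combination of title-plus-body keyword matching and result filtering described above can be sketched in a few lines. This is only an illustration under stated assumptions: the <code>Doc</code> record, its field names, and the ranking rule (title hits first, then newest first) are hypothetical and are not taken from 飞搜侠's implementation.</p>

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Doc:
    """Hypothetical index record; field names are assumptions."""
    title: str
    body: str
    author: str
    doc_type: str
    created: date


def search(
    docs: List[Doc],
    keyword: str,
    doc_type: Optional[str] = None,
    created_after: Optional[date] = None,
) -> List[Doc]:
    """Match the keyword in title, body, or author, then apply optional filters."""
    kw = keyword.lower()
    hits = []
    for d in docs:
        if doc_type is not None and d.doc_type != doc_type:
            continue  # filtered out by document type
        if created_after is not None and not d.created >= created_after:
            continue  # filtered out by creation date
        in_title = kw in d.title.lower()
        in_rest = kw in d.body.lower() or kw in d.author.lower()
        if in_title or in_rest:
            hits.append((0 if in_title else 1, d))
    # Title matches rank first; within a rank, newer documents come first.
    hits.sort(key=lambda pair: (pair[0], -pair[1].created.toordinal()))
    return [d for _, d in hits]


SAMPLE_DOCS = [
    Doc("Prompt collection", "Useful prompts for writing", "Li", "doc", date(2024, 5, 1)),
    Doc("AI tools list", "Includes a prompt library", "Wang", "sheet", date(2024, 6, 1)),
    Doc("Career skills", "Interview tips", "Zhao", "doc", date(2023, 1, 1)),
]

if __name__ == "__main__":
    for doc in search(SAMPLE_DOCS, "prompt"):
        print(doc.title)
```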

RAG-Anything
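<p>RAG-Anything is an open-source multimodal RAG system from the Data Intelligence Lab at the University of Hong Kong. It handles complex documents that contain text, images, tables, and formulas, and provides an end-to-end solution from document ingestion to intelligent querying. Built on a multimodal knowledge graph, a flexible parsing architecture, and a hybrid retrieval mechanism, it improves the handling of complex documents and supports multiple formats, including PDF, Office documents, images, and text files. Its core strengths are an end-to-end multimodal pipeline, multi-format document support, a multimodal content analysis engine, knowledge-graph indexing, a flexible processing architecture, and cross-modal retrieval.</p> <h2 style="font-size: 20px;">RAG-Anything project links</h2> <ul> <li>GitHub repository: https://github.com/HKUDS/RAG-Anything</li> <li>arXiv technical paper: https://arxiv.org/pdf/2410.05779</li> </ul>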

Agentic Document Extraction
<div class="markdown-heading" dir="auto"> <h2 class="heading-element" dir="auto" tabindex="-1">Overview</h2> <a id="user-content-overview" class="anchor" href="https://github.com/landing-ai/agentic-doc#overview" aria-label="Permalink: Overview"></a></div> <p>The LandingAI Agentic Document Extraction API extracts structured data from visually complex documents (such as tables, pictures, and charts) and returns hierarchical JSON with exact element locations.</p> <p>This Python library wraps that API to provide:</p> <ul dir="auto"> <li>Long-document support: process 100+ page PDFs in a single call</li> <li>Auto-retry / paging: handles concurrency, timeouts, and rate limits</li> <li>Helper utilities: bounding-box snippets, visual debuggers, and more</li> </ul> <div class="markdown-heading" dir="auto"> <h3 class="heading-element" dir="auto" tabindex="-1">Features</h3> <a id="user-content-features" class="anchor" href="https://github.com/landing-ai/agentic-doc#features" aria-label="Permalink: Features"></a></div> <ul dir="auto"> <li>📦 Batteries-included install: <code>pip install agentic-doc</code>, nothing else needed → see <a href="https://github.com/landing-ai/agentic-doc#installation">Installation</a></li> <li>🗂️ All file types: parse PDFs of any length, single images, or URLs → see <a href="https://github.com/landing-ai/agentic-doc#supported-files">Supported Files</a></li> <li>📚 Long-document ready: automatically splits and processes 1000+ page PDFs in parallel, then stitches the results together → see <a href="https://github.com/landing-ai/agentic-doc#parse-large-pdf-files">Parse Large PDF Files</a></li> <li>🧩 Structured output: returns hierarchical JSON plus ready-to-render Markdown → see <a href="https://github.com/landing-ai/agentic-doc#result-schema">Result Schema</a></li> <li>👁️ Ground-truth visuals: optional bounding-box snippets and full-page visualizations → see <a href="https://github.com/landing-ai/agentic-doc#save-groundings-as-images">Save Groundings as Images</a></li> <li>🏃 Batch and parallel: pass in a list; the library manages threads and rate limits (<code>BATCH_SIZE</code>, <code>MAX_WORKERS</code>) → see <a href="https://github.com/landing-ai/agentic-doc#parse-multiple-files-in-a-batch">Parse Multiple Files in a Batch</a></li> <li>🔄 Resilient: exponential-backoff retries for 408/429/502/503/504 responses and rate-limit hits → see <a href="https://github.com/landing-ai/agentic-doc#automatically-handle-api-errors-and-rate-limits-with-retries">Automatically Handle API Errors and Rate Limits with Retries</a></li> <li>🛠️ Drop-in helpers: <code>parse_documents</code>, <code>parse_and_save_documents</code>, <code>parse_and_save_document</code> → see <a href="https://github.com/landing-ai/agentic-doc#main-functions">Main Functions</a></li> <li>⚙️ Configuration via env / .env: tune parallelism, logging style, and retry caps with no code changes → see <a href="https://github.com/landing-ai/agentic-doc#configuration-options">Configuration Options</a></li> <li>🌐 Raw API ready: advanced users can still call the REST endpoint directly → see the <a href="https://support.landing.ai/docs/document-extraction" rel="nofollow">API Docs</a></li> </ul> <div class="markdown-heading" dir="auto"> <h2 class="heading-element" dir="auto" tabindex="-1">Quick Start</h2> <a id="user-content-quick-start" class="anchor" href="https://github.com/landing-ai/agentic-doc#quick-start" aria-label="Permalink: Quick Start"></a></div> <div class="markdown-heading" dir="auto"> <h3 class="heading-element" dir="auto" tabindex="-1">Installation</h3> <a id="user-content-installation" class="anchor" href="https://github.com/landing-ai/agentic-doc#installation" aria-label="Permalink: Installation"></a></div> <div class="highlight highlight-source-shell notranslate position-relative overflow-auto" dir="auto"> <pre>pip install agentic-doc</pre> </div>
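<p>The retry behaviour listed under the features above (exponential backoff on 408/429/502/503/504) can be illustrated with a generic sketch. This is not the library's own implementation: the function name <code>call_with_backoff</code>, its parameters, and the 10% jitter are assumptions chosen for the example, and <code>send</code> stands in for whatever performs the HTTP request.</p>

```python
import random
import time
from typing import Callable, Tuple

# Status codes the feature list names as retryable.
RETRYABLE_STATUS = {408, 429, 502, 503, 504}


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or retries run out.

    send() must return a (status_code, body) pair. All parameter names here
    are hypothetical; they do not mirror agentic-doc's configuration keys.
    """
    if 0 > max_retries:
        raise ValueError("max_retries must be non-negative")
    for attempt in range(max_retries + 1):
        status, body = send()
        if status not in RETRYABLE_STATUS:
            return status, body
        if attempt == max_retries:
            raise RuntimeError(
                f"giving up after {max_retries} retries (last status {status})"
            )
        # Delay doubles on every attempt, capped at max_delay, plus small jitter.
        delay = min(max_delay, base_delay * 2 ** attempt)
        sleep(delay + random.uniform(0, 0.1 * delay))
    raise RuntimeError("unreachable")


if __name__ == "__main__":
    canned = iter([(429, ""), (503, ""), (200, "ok")])
    waits = []
    print(call_with_backoff(lambda: next(canned), sleep=waits.append), waits)
```

<p>Passing <code>sleep</code> as a parameter keeps the sketch testable without real waiting; production code would normally leave the default.</p>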

Dolphin
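<p>Dolphin is a lightweight, efficient document parsing model open-sourced by ByteDance. It follows a two-stage, structure-then-content approach: the first stage generates a sequence of document layout elements, and the second stage uses those elements as anchors to parse their content in parallel. Dolphin performs well across a range of document parsing tasks and is reported to outperform models such as GPT-4.1 and Mistral-OCR. With 322M parameters, it is small and fast, and it parses multiple element types, including text, tables, and formulas. The code and pretrained model are publicly available for developers to use and study.</p> <h2 style="font-size: 20px;">Main features of Dolphin</h2> <ul> <li>Layout analysis: identifies the elements in a document (titles, figures, tables, footnotes, and so on) and generates the element sequence in natural reading order.</li> <li>Content extraction: parses a whole document page into structured JSON or Markdown for downstream processing and display.</li> <li>Text paragraph parsing: recognizes and extracts the text content of a document, with multilingual support (for example Chinese and English).</li> <li>Formula recognition: recognizes complex formulas, both inline and block-level, and outputs LaTeX.</li> <li>Table parsing: parses complex table structures, extracts cell contents, and generates HTML tables.</li> <li>Lightweight architecture: 322M parameters, small and fast, suitable for resource-constrained environments.</li> <li>Multiple input types: handles many kinds of document images, including academic papers, business reports, and technical documents.</li> <li>Multiple output formats: results can be exported as JSON, Markdown, HTML, and other formats for integration with different systems.</li> </ul> <h2 style="font-size: 20px;">How Dolphin works</h2> <ul> <li>Page-level layout analysis: a Swin Transformer encodes the input document image and extracts visual features. A decoder then generates the sequence of document elements, each with its category (title, table, figure, and so on) and coordinates. The goal of this stage is structured layout information in natural reading order.</li> <li>Element-level content parsing: using the layout from the first stage, a local view of each element is cropped from the original image. Each element is then parsed in parallel with a type-specific prompt. For example, tables use a dedicated prompt that produces HTML, while formulas and text paragraphs share a prompt that produces LaTeX. The decoder generates the final parsed content from the cropped element image and its prompt.</li> </ul> <h2 style="font-size: 20px;">Dolphin project links</h2> <ul> <li>GitHub repository: <a class="external" href="https://github.com/bytedance/Dolphin" target="_blank" rel="noopener">https://github.com/bytedance/Dolphin</a></li> <li>HuggingFace model hub: <a class="external" href="https://huggingface.co/ByteDance/Dolphin" target="_blank" rel="noopener nofollow">https://huggingface.co/ByteDance/Dolphin</a></li> <li>arXiv technical paper: <a class="external" href="https://arxiv.org/pdf/2505.14059" target="_blank" rel="noopener nofollow">https://arxiv.org/pdf/2505.14059</a></li> <li>Online demo: <a class="external" href="http://115.190.42.15:8888/dolphin/" target="_blank" rel="noopener nofollow">http://115.190.42.15:8888/dolphin/</a></li> </ul>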
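<p>The two-stage flow described above (layout first, then parallel per-element parsing with type-specific prompts) can be sketched without any model weights. Everything below is an assumption made for illustration: the <code>Element</code> type, the prompt wording, the toy character-grid "page", and the <code>fake_layout</code> / <code>fake_parser</code> stubs are hypothetical and are not Dolphin's actual interfaces.</p>

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1)
Page = List[str]                 # toy "image": one string per pixel row


@dataclass
class Element:
    kind: str  # "text", "formula", or "table"
    box: Box


# Hypothetical prompts: tables get their own, formulas and text share one.
SHARED_PROMPT = "Read the content as LaTeX."
PROMPTS: Dict[str, str] = {
    "table": "Parse the table as HTML.",
    "formula": SHARED_PROMPT,
    "text": SHARED_PROMPT,
}


def crop_region(page: Page, box: Box) -> Page:
    """Cut the local view of one element out of the page."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in page[y0:y1]]


def parse_page(
    page: Page,
    analyze_layout: Callable[[Page], List[Element]],
    parse_element: Callable[[Page, str], str],
    max_workers: int = 4,
) -> List[dict]:
    # Stage 1: layout elements in reading order.
    elements = analyze_layout(page)

    # Stage 2: each element is cropped and parsed independently, in parallel.
    def parse_one(el: Element) -> str:
        return parse_element(crop_region(page, el.box), PROMPTS[el.kind])

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        contents = list(pool.map(parse_one, elements))  # map keeps input order
    return [
        {"type": el.kind, "box": el.box, "content": text}
        for el, text in zip(elements, contents)
    ]


DEMO_PAGE: Page = ["HEADxxxx", "a+b=cxxx", "r1c1r1c2"]


def fake_layout(page: Page) -> List[Element]:
    """Stub for the layout stage."""
    return [
        Element("text", (0, 0, 4, 1)),
        Element("formula", (0, 1, 5, 2)),
        Element("table", (0, 2, 8, 3)),
    ]


def fake_parser(crop: Page, prompt: str) -> str:
    """Stub for the content stage: tags the crop with the requested format."""
    return prompt.split()[-1].rstrip(".") + ":" + "".join(crop)


if __name__ == "__main__":
    for item in parse_page(DEMO_PAGE, fake_layout, fake_parser):
        print(item)
```

<p>Because each crop is parsed independently, the second stage parallelizes naturally, while the order of the final result is still fixed by the first stage.</p>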

Pemo
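<p>Pemo is an AI-powered document management tool. It imports and manages documents in formats such as PDF, Epub, and Word, and offers one-click translation, AI summaries, and mind-map generation to help users understand complex literature quickly and read more efficiently. Pemo provides an immersive reading experience in which users can customize the reading mode and add annotations and notes to capture ideas. It also converts between document formats, making it a useful aid for students, researchers, and professionals.</p> <p><img src="https://img.medsci.cn/aisite/img//vfRbHQKce6JgZFjoSVaeTxzkVmdAzNhHm1tu4Fno.png"></p> <h2 style="font-size: 20px;">Main features of Pemo</h2> <ul> <li>Import and categorization: imports PDF, Epub, Word, and other formats and organizes them into categories for easy lookup.</li> <li>Format conversion: converts documents between formats, such as PDF to Word or Epub to PDF, for different reading and editing needs.</li> <li>AI translation: translates foreign-language documents in real time so users can read multilingual content.</li> <li>Read-aloud: converts books and papers to speech so users can listen anywhere.</li> <li>AI summaries: automatically generates summaries so users can grasp the core content quickly.</li> <li>Mind maps: turns complex literature into visual mind maps to aid understanding and memory.</li> <li>Smart notes: take notes while reading, with AI automatically linking related content.</li> <li>Document annotation: add highlights, notes, and bookmarks to e-books and PDF documents.</li> </ul> <h2 style="font-size: 20px;">Pemo official website</h2> <ul> <li>Official website: <a href="https://pemo.ai/" target="_blank" rel="noopener">pemo.ai</a></li> </ul>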

ContextGem
<p dir="auto"><a href="https://camo.githubusercontent.com/dcc762e8d3dc538b9e7dffbc07f3a0b3bfae2e4b56c8d5670075d156cd5d53b6/68747470733a2f2f636f6e7465787467656d2e6465762f5f7374617469632f636f6e7465787467656d5f726561646d655f6865616465722e706e67" target="_blank" rel="noopener noreferrer nofollow"><img title="ContextGem - Effortless LLM extraction from documents" src="https://camo.githubusercontent.com/dcc762e8d3dc538b9e7dffbc07f3a0b3bfae2e4b56c8d5670075d156cd5d53b6/68747470733a2f2f636f6e7465787467656d2e6465762f5f7374617469632f636f6e7465787467656d5f726561646d655f6865616465722e706e67" alt="ContextGem" data-canonical-src="https://contextgem.dev/_static/contextgem_readme_header.png"></a></p> <div class="markdown-heading" dir="auto"> <h1 class="heading-element" dir="auto" tabindex="-1">ContextGem: Effortless LLM extraction from documents</h1> <a id="user-content-contextgem-effortless-llm-extraction-from-documents" class="anchor" href="https://github.com/shcherbak-ai/contextgem#contextgem-effortless-llm-extraction-from-documents" aria-label="Permalink: ContextGem: Effortless LLM extraction from documents"></a></div> <p dir="auto">ContextGem is a free, open-source LLM framework that makes it easier to extract structured data and insights from documents with minimal code.</p> <div class="markdown-heading" dir="auto"> <h2 class="heading-element" dir="auto" tabindex="-1">💎 Why ContextGem?</h2> <a id="user-content--why-contextgem" class="anchor" href="https://github.com/shcherbak-ai/contextgem#-why-contextgem" aria-label="Permalink: 💎 Why ContextGem?"></a></div> <p dir="auto">Most popular LLM frameworks for extracting structured data from documents require extensive boilerplate code, even for basic information. This significantly increases development time and complexity.</p> <p dir="auto">ContextGem addresses this by providing a flexible, intuitive framework that extracts structured data and insights from documents with minimal effort. The complex, time-consuming parts are handled by <strong>powerful abstractions</strong>, which eliminates boilerplate code and reduces development overhead.</p> <div class="markdown-heading" dir="auto"> <h2 class="heading-element" dir="auto" tabindex="-1">⭐ Key features</h2> <a id="user-content--key-features" class="anchor" href="https://github.com/shcherbak-ai/contextgem#-key-features" aria-label="Permalink: ⭐ Key features"></a></div> <table> <thead> <tr> <th>Built-in abstractions</th> <th><strong>ContextGem</strong></th> <th>Other LLM frameworks*</th> </tr> </thead> <tbody> <tr> <td>Automated dynamic prompts</td> <td>🟢</td> <td>◯</td> </tr> <tr> <td>Automated data modelling and validators</td> <td>🟢</td> <td>◯</td> </tr> <tr> <td>Precise granular reference mapping (paragraphs and sentences)</td> <td>🟢</td> <td>◯</td> </tr> <tr> <td>Justifications (reasoning behind the extraction)</td> <td>🟢</td> <td>◯</td> </tr> <tr> <td>Neural segmentation (SaT)</td> <td>🟢</td> <td>◯</td> </tr> <tr> <td>Multilingual support (input/output without prompting)</td> <td>🟢</td> <td>◯</td> </tr> <tr> <td>Single, unified extraction pipeline (declarative, reusable, fully serializable)</td> <td>🟢</td> <td>🟡</td> </tr> <tr> <td>Grouped LLMs with role-specific tasks</td> <td>🟢</td> <td>🟡</td> </tr> <tr> <td>Nested context extraction</td> <td>🟢</td> <td>🟡</td> </tr> <tr> <td>Unified, fully serializable results storage model (document)</td> <td>🟢</td> <td>🟡</td> </tr> <tr> <td>Extraction task calibration with examples</td> <td>🟢</td> <td>🟡</td> </tr> <tr> <td>Built-in concurrent I/O processing</td> <td>🟢</td> <td>🟡</td> </tr> <tr> <td>Automated usage and cost tracking</td> <td>🟢</td> <td>🟡</td> </tr> <tr> <td>Fallback and retry logic</td> <td>🟢</td> <td>🟢</td> </tr> <tr> <td>Multiple LLM providers</td> <td>🟢</td> <td>🟢</td> </tr> </tbody> </table> <p dir="auto">🟢 - fully supported, no additional setup required<br>🟡 - partially supported, requires additional setup<br>◯ - not supported, requires custom logic</p> <p dir="auto">* See the <a href="https://contextgem.dev/motivation.html#the-contextgem-solution" rel="nofollow">descriptions</a> of ContextGem abstractions and the <a href="https://contextgem.dev/vs_other_frameworks.html" rel="nofollow">comparisons</a> of specific implementation examples using ContextGem and other popular open-source LLM frameworks.</p> <div class="markdown-heading" dir="auto"> <h2 class="heading-element" dir="auto" tabindex="-1">💡 With <strong>minimal code</strong>, you can:</h2> <a id="user-content--with-minimal-code-you-can" class="anchor" href="https://github.com/shcherbak-ai/contextgem#-with-minimal-code-you-can" aria-label="Permalink: 💡 With minimal code, you can:"></a></div> <ul dir="auto">
<li><strong>Extract structured data</strong> from documents (text, images)</li> <li><strong>Identify and analyze key aspects</strong> (topics, themes, categories) within documents</li> <li><strong>Extract specific concepts</strong> (entities, facts, conclusions, assessments) from documents</li> <li><strong>Build complex extraction workflows</strong> through a simple, intuitive API</li> <li><strong>Create multi-level extraction pipelines</strong> (aspects containing concepts, hierarchical aspects)</li> </ul> <p dir="auto"><a href="https://camo.githubusercontent.com/84c9fdd0aa6c0023582ec31ee75d304e1fc63abc15882a8092514ad4190ea616/68747470733a2f2f636f6e7465787467656d2e6465762f5f7374617469632f726561646d655f636f64655f736e69707065742e706e67" target="_blank" rel="noopener noreferrer nofollow"><img title="ContextGem extraction example" src="https://camo.githubusercontent.com/84c9fdd0aa6c0023582ec31ee75d304e1fc63abc15882a8092514ad4190ea616/68747470733a2f2f636f6e7465787467656d2e6465762f5f7374617469632f726561646d655f636f64655f736e69707065742e706e67" alt="ContextGem extraction example" data-canonical-src="https://contextgem.dev/_static/readme_code_snippet.png"></a></p> <div class="markdown-heading" dir="auto"> <h2 class="heading-element" dir="auto" tabindex="-1">📦 Installation</h2> <a id="user-content--installation" class="anchor" href="https://github.com/shcherbak-ai/contextgem#-installation" aria-label="Permalink: 📦 Installation"></a></div> <div class="highlight highlight-source-shell notranslate position-relative overflow-auto" dir="auto"> <pre>pip install -U contextgem</pre> </div> <div class="markdown-heading" dir="auto"> <h2 class="heading-element" dir="auto" tabindex="-1">🚀 Quick start</h2> <a id="user-content--quick-start" class="anchor" href="https://github.com/shcherbak-ai/contextgem#-quick-start" aria-label="Permalink: 🚀 Quick start"></a></div> <div class="highlight highlight-source-python notranslate position-relative overflow-auto" dir="auto"> <pre><span class="pl-c"># Quick Start Example - Extracting anomalies from a document, with source references and justifications</span>

<span class="pl-k">import</span> <span class="pl-s1">os</span>

<span class="pl-k">from</span> <span class="pl-s1">contextgem</span> <span class="pl-k">import</span> <span class="pl-v">Document</span>, <span class="pl-v">DocumentLLM</span>, <span class="pl-v">StringConcept</span>

<span class="pl-c"># Sample document text (shortened for brevity)</span>
<span class="pl-s1">doc</span> <span class="pl-c1">=</span> <span class="pl-en">Document</span>(
    <span class="pl-s1">raw_text</span><span class="pl-c1">=</span>(
        <span class="pl-s">"Consultancy Agreement<span class="pl-cce">\n</span>"</span>
        <span class="pl-s">"This agreement between Company A (Supplier) and Company B (Customer)...<span class="pl-cce">\n</span>"</span>
        <span class="pl-s">"The term of the agreement is 1 year from the Effective Date...<span class="pl-cce">\n</span>"</span>
        <span class="pl-s">"The Supplier shall provide consultancy services as described in Annex 2...<span class="pl-cce">\n</span>"</span>
        <span class="pl-s">"The Customer shall pay the Supplier within 30 calendar days of receiving an invoice...<span class="pl-cce">\n</span>"</span>
        <span class="pl-s">"The purple elephant danced gracefully on the moon while eating ice cream.<span class="pl-cce">\n</span>"</span>  <span class="pl-c"># 💎 anomaly</span>
        <span class="pl-s">"This agreement is governed by the laws of Norway...<span class="pl-cce">\n</span>"</span>
    ),
)

<span class="pl-c"># Attach a document-level concept</span>
<span class="pl-s1">doc</span>.<span class="pl-c1">concepts</span> <span class="pl-c1">=</span> [
    <span class="pl-en">StringConcept</span>(
        <span class="pl-s1">name</span><span class="pl-c1">=</span><span class="pl-s">"Anomalies"</span>,  <span class="pl-c"># in longer contexts, this concept is hard to capture with RAG</span>
        <span class="pl-s1">description</span><span class="pl-c1">=</span><span class="pl-s">"Anomalies in the document"</span>,
        <span class="pl-s1">add_references</span><span class="pl-c1">=</span><span class="pl-c1">True</span>,
        <span class="pl-s1">reference_depth</span><span class="pl-c1">=</span><span class="pl-s">"sentences"</span>,
        <span class="pl-s1">add_justifications</span><span class="pl-c1">=</span><span class="pl-c1">True</span>,
        <span class="pl-s1">justification_depth</span><span class="pl-c1">=</span><span class="pl-s">"brief"</span>,
        <span class="pl-c"># see the docs for more configuration options</span>
    )
    <span class="pl-c"># add more concepts to the document, if needed</span>
    <span class="pl-c"># see the docs for available concepts: StringConcept, JsonObjectConcept, etc.</span>
]
<span class="pl-c"># Or use `doc.add_concepts([...])`</span>

<span class="pl-c"># Define an LLM for extracting information from the document</span>
<span class="pl-s1">llm</span> <span class="pl-c1">=</span> <span class="pl-en">DocumentLLM</span>(
    <span class="pl-s1">model</span><span class="pl-c1">=</span><span class="pl-s">"openai/gpt-4o-mini"</span>,  <span class="pl-c"># or another provider/LLM</span>
    <span class="pl-s1">api_key</span><span class="pl-c1">=</span><span class="pl-s1">os</span>.<span class="pl-c1">environ</span>.<span class="pl-c1">get</span>(
        <span class="pl-s">"CONTEXTGEM_OPENAI_API_KEY"</span>
    ),  <span class="pl-c"># your API key for the LLM provider</span>
    <span class="pl-c"># see the docs for more configuration options</span>
)

<span class="pl-c"># Extract information from the document</span>
<span class="pl-s1">doc</span> <span class="pl-c1">=</span> <span class="pl-s1">llm</span>.<span class="pl-c1">extract_all</span>(<span class="pl-s1">doc</span>)  <span class="pl-c"># or use async version `await llm.extract_all_async(doc)`</span>

<span class="pl-c"># Access extracted information in the document object</span>
<span class="pl-en">print</span>(
    <span class="pl-s1">doc</span>.<span class="pl-c1">concepts</span>[<span class="pl-c1">0</span>].<span class="pl-c1">extracted_items</span>
)  <span class="pl-c"># extracted items with references &amp; justifications</span>
<span class="pl-c"># or `doc.get_concept_by_name("Anomalies").extracted_items`</span></pre> <div
class="zeroclipboard-container">&nbsp;</div> </div> <p dir="auto"><a href="https://colab.research.google.com/github/shcherbak-ai/contextgem/blob/main/dev/notebooks/readme/quickstart_concept.ipynb" rel="nofollow"><img src="https://camo.githubusercontent.com/96889048f8a9014fdeba2a891f97150c6aac6e723f5190236b10215a97ed41f3/68747470733a2f2f636f6c61622e72657365617263682e676f6f676c652e636f6d2f6173736574732f636f6c61622d62616467652e737667" alt="在 Colab 中打开" data-canonical-src="https://colab.research.google.com/assets/colab-badge.svg"></a></p> <hr> <p dir="auto">请参阅文档中的更多示例:</p> <div class="markdown-heading" dir="auto"> <h3 class="heading-element" dir="auto" tabindex="-1">基本使用示例</h3> <a id="user-content-basic-usage-examples" class="anchor" href="https://github.com/shcherbak-ai/contextgem#basic-usage-examples" aria-label="永久链接:基本用法示例"></a></div> <ul dir="auto"> <li><a href="https://contextgem.dev/quickstart.html#aspect-extraction-from-document" rel="nofollow">从文档中提取方面</a></li> <li><a href="https://contextgem.dev/quickstart.html#extracting-aspect-with-sub-aspects" rel="nofollow">使用子方面提取方面</a></li> <li><a href="https://contextgem.dev/quickstart.html#concept-extraction-from-aspect" rel="nofollow">从方面提取概念</a></li> <li><a href="https://contextgem.dev/quickstart.html#concept-extraction-from-document-text" rel="nofollow">从文档(文本)中提取概念</a></li> <li><a href="https://contextgem.dev/quickstart.html#concept-extraction-from-document-vision" rel="nofollow">从文档中提取概念(视觉)</a></li> <li><a href="https://contextgem.dev/quickstart.html#lightweight-llm-chat-interface" rel="nofollow">LLM聊天界面</a></li> </ul> <div class="markdown-heading" dir="auto"> <h3 class="heading-element" dir="auto" tabindex="-1">高级用法示例</h3> <a id="user-content-advanced-usage-examples" class="anchor" href="https://github.com/shcherbak-ai/contextgem#advanced-usage-examples" aria-label="永久链接:高级用法示例"></a></div> <ul dir="auto"> <li><a href="https://contextgem.dev/advanced_usage.html#extracting-aspects-with-concepts" 
rel="nofollow">提取包含概念的方面</a></li> <li><a href="https://contextgem.dev/advanced_usage.html#extracting-aspects-and-concepts-from-a-document" rel="nofollow">从文档中提取方面和概念</a></li> <li><a href="https://contextgem.dev/advanced_usage.html#using-a-multi-llm-pipeline-to-extract-data-from-several-documents" rel="nofollow">使用多 LLM 管道从多个文档中提取数据</a></li> </ul> <div class="markdown-heading" dir="auto"> <h2 class="heading-element" dir="auto" tabindex="-1">🔄 文档转换器</h2> <a id="user-content--document-converters" class="anchor" href="https://github.com/shcherbak-ai/contextgem#-document-converters" aria-label="永久链接:🔄 文档转换器"></a></div> <p dir="auto">要创建用于 LLM 分析的 ContextGem 文档,您可以直接传递原始文本,也可以使用处理各种文件格式的内置转换器。</p> <div class="markdown-heading" dir="auto"> <h3 class="heading-element" dir="auto" tabindex="-1">📄 DOCX 转换器</h3> <a id="user-content--docx-converter" class="anchor" href="https://github.com/shcherbak-ai/contextgem#-docx-converter" aria-label="永久链接:📄 DOCX 转换器"></a></div> <p dir="auto">ContextGem 提供内置转换器,可轻松将 DOCX 文件转换为 LLM 就绪数据。</p> <ul dir="auto"> <li>提取其他开源工具通常无法捕获的信息:未对齐的表格、注释、脚注、文本框、页眉/页脚和嵌入图像</li> <li>保留具有丰富元数据的文档结构,以改进 LLM 分析</li> </ul> <div class="highlight highlight-source-python notranslate position-relative overflow-auto" dir="auto"> <pre><span class="pl-c"># Using ContextGem's DocxConverter</span> <span class="pl-k">from</span> <span class="pl-s1">contextgem</span> <span class="pl-k">import</span> <span class="pl-v">DocxConverter</span> <span class="pl-s1">converter</span> <span class="pl-c1">=</span> <span class="pl-en">DocxConverter</span>() <span class="pl-c"># Convert a DOCX file to an LLM-ready ContextGem Document</span> <span class="pl-c"># from path</span> <span class="pl-s1">document</span> <span class="pl-c1">=</span> <span class="pl-s1">converter</span>.<span class="pl-c1">convert</span>(<span class="pl-s">"path/to/document.docx"</span>) <span class="pl-c"># or from file object</span> <span class="pl-k">with</span> <span class="pl-en">open</span>(<span 
class="pl-s">"path/to/document.docx"</span>, <span class="pl-s">"rb"</span>) <span class="pl-k">as</span> <span class="pl-s1">docx_file_object</span>: <span class="pl-s1">document</span> <span class="pl-c1">=</span> <span class="pl-s1">converter</span>.<span class="pl-c1">convert</span>(<span class="pl-s1">docx_file_object</span>) <span class="pl-c"># You can also use it as a standalone text extractor</span> <span class="pl-s1">docx_text</span> <span class="pl-c1">=</span> <span class="pl-s1">converter</span>.<span class="pl-c1">convert_to_text_format</span>( <span class="pl-s">"path/to/document.docx"</span>, <span class="pl-s1">output_format</span><span class="pl-c1">=</span><span class="pl-s">"markdown"</span>, <span class="pl-c"># or "raw"</span> )</pre> <div class="zeroclipboard-container">&nbsp;</div> </div> <p dir="auto">在文档中了解有关<a href="https://contextgem.dev/converters/docx.html" rel="nofollow">DOCX 转换器功能的更多信息。</a></p> <div class="markdown-heading" dir="auto"> <h2 class="heading-element" dir="auto" tabindex="-1">🎯 重点文档分析</h2> <a id="user-content--focused-document-analysis" class="anchor" href="https://github.com/shcherbak-ai/contextgem#-focused-document-analysis" aria-label="永久链接:🎯 重点文档分析"></a></div> <p dir="auto">ContextGem 利用 LLM 的长上下文窗口,从单个文档中提取出卓越的准确率。与 RAG 方法(通常<a href="https://www.linkedin.com/pulse/raging-contracts-pitfalls-rag-contract-review-shcherbak-ai-ptg3f" rel="nofollow">难以处理复杂概念和细微洞察)</a>不同,ContextGem 充分利用了<a href="https://arxiv.org/abs/2502.12962" rel="nofollow">持续扩展的上下文容量</a>、不断改进的 LLM 功能以及降低的成本。这种专注的方法能够直接从完整文档中提取信息,消除检索不一致,同时针对深入的单文档分析进行优化。虽然这可以提高单个文档的准确率,但 ContextGem 目前不支持跨文档查询或全语料库检索&mdash;&mdash;对于这些用例,现代 RAG 系统(例如 LlamaIndex、Haystack)仍然更为合适。</p> <div class="markdown-heading" dir="auto"> <h2 class="heading-element" dir="auto" tabindex="-1">🤖 支持</h2> <a id="user-content--supported-llms" class="anchor" href="https://github.com/shcherbak-ai/contextgem#-supported-llms" aria-label="永久链接:🤖 支持的法学硕士"></a></div> <p dir="auto"><a 
href="https://github.com/BerriAI/litellm">ContextGem 通过LiteLLM</a>集成支持基于云和本地的 LLM :</p> <ul dir="auto"> <li><strong>云端法学硕士</strong>:OpenAI、Anthropic、Google、Azure OpenAI 等</li> <li><strong>本地 LLM</strong>:使用 Ollama、LM Studio 等提供商在本地运行模型。</li> <li><strong>模型架构</strong>:适用于推理/CoT 功能(例如 o4-mini)和非推理模型(例如 gpt-4.1)</li> <li><strong>简单的 API</strong>:所有 LLM 的统一接口,可轻松切换提供商</li> </ul> <p dir="auto">在文档中了解<a href="https://contextgem.dev/llms/supported_llms.html" rel="nofollow">有关支持的 LLM 提供程序和模型</a>以及如何<a href="https://contextgem.dev/llms/llm_config.html" rel="nofollow">配置 LLM 的更多信息。</a></p> <div class="markdown-heading" dir="auto"> <h2 class="heading-element" dir="auto" tabindex="-1">⚡ 优化</h2> <a id="user-content--optimizations" class="anchor" href="https://github.com/shcherbak-ai/contextgem#-optimizations" aria-label="永久链接:⚡ 优化"></a></div> <p dir="auto">ContextGem 文档提供了有关优化策略的指导,以最大限度地提高性能、最大限度地降低成本并提高提取准确性:</p> <ul dir="auto"> <li><a href="https://contextgem.dev/optimizations/optimization_accuracy.html" rel="nofollow">优化准确性</a></li> <li><a href="https://contextgem.dev/optimizations/optimization_speed.html" rel="nofollow">优化速度</a></li> <li><a href="https://contextgem.dev/optimizations/optimization_cost.html" rel="nofollow">优化成本</a></li> <li><a href="https://contextgem.dev/optimizations/optimization_long_docs.html" rel="nofollow">处理长文档</a></li> <li><a href="https://contextgem.dev/optimizations/optimization_choosing_llm.html" rel="nofollow">选择合适的法学硕士</a></li> </ul> <div class="markdown-heading" dir="auto"> <h2 class="heading-element" dir="auto" tabindex="-1">💾 序列化结果</h2> <a id="user-content--serializing-results" class="anchor" href="https://github.com/shcherbak-ai/contextgem#-serializing-results" aria-label="永久链接:💾 序列化结果"></a></div> <p dir="auto">ContextGem 允许您使用内置序列化方法保存和加载 Document 对象、管道和 LLM 配置:</p> <ul dir="auto"> <li>保存已处理的文档以避免重复昂贵的 LLM 调用</li> <li>在系统之间传输提取结果</li> <li>保留管道和 LLM 配置以供以后重用</li> </ul> <p dir="auto">在文档中了解有关<a href="https://contextgem.dev/serialization.html" 
rel="nofollow">序列化选项的更多信息。</a></p> <div class="markdown-heading" dir="auto"> <h2 class="heading-element" dir="auto" tabindex="-1">📚 文档</h2> <a id="user-content--documentation" class="anchor" href="https://github.com/shcherbak-ai/contextgem#-documentation" aria-label="永久链接:📚 文档"></a></div> <p dir="auto">完整文档可在<a href="https://contextgem.dev/" rel="nofollow">contextgem.dev</a>上找到。</p> <p dir="auto">完整文档的原始文本版本可在 处获取<a href="https://github.com/shcherbak-ai/contextgem/blob/main/docs/docs-raw-for-llm.txt"><code>docs/docs-raw-for-llm.txt</code></a>。此文件自动生成,包含所有文档,其格式已针对 LLM 导入进行了优化(例如,用于问答)。</p>

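Grumpy Professor Reads Papers (mad-professor)
"暴躁教授读论文" (Grumpy Professor Reads Papers) is a companion application for reading academic papers that aims to make paper reading more efficient through an AI assistant with a distinct personality. It combines PDF processing, AI translation, RAG retrieval, AI question answering, and voice interaction into a one-stop paper reading tool for academic researchers.

Main features
Automatic paper processing: after a PDF is imported, the paper content is extracted, translated, and structured automatically
Bilingual display: read papers with Chinese and English side by side
AI question answering: explanations and analysis grounded in the paper content
Personalized AI professor: the AI answers in the persona of a "grumpy professor", which makes reading more fun
Voice interaction: spoken questions and TTS spoken answers
RAG-enhanced retrieval: precise retrieval and location of passages within the paper
Split-screen interface: paper content on the left, AI Q&A on the right

Technical architecture
Front end: a desktop application built with PyQt6
Core engine:
  AI Q&A module: an LLM-based academic question answering system
  RAG retrieval system: vector retrieval that improves answer precision
  Paper processing pipeline: PDF to Markdown conversion, automatic translation, structured parsing
Interaction system:
  Speech recognition: real-time recognition of spoken input
  TTS speech synthesis: AI answers read aloud in real time
  Emotion recognition: adjusts the tone of the answer based on the question

Installation guide
Requirements
Python 3.10 or later
CUDA support
At least 6 GB of GPU memory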
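
The RAG retrieval step in the architecture above can be illustrated with a small, dependency-free sketch. Note the substitution: a real system would use learned embedding vectors, whereas this example uses plain bag-of-words counts with cosine similarity so that it runs anywhere. The function names and sample chunks are hypothetical and are not taken from the mad-professor code base.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(tokenize(text))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: List[str], top_k: int = 2) -> List[str]:
    """Return up to top_k chunks most similar to the question, best first."""
    query = embed(question)
    scored = [(cosine(query, embed(chunk)), index) for index, chunk in enumerate(chunks)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [chunks[index] for score, index in scored[:top_k] if score > 0]


PAPER_CHUNKS = [
    "We train a transformer with attention layers.",
    "The dataset contains ten thousand images.",
    "Results show attention improves accuracy.",
]

if __name__ == "__main__":
    for passage in retrieve("How does attention work?", PAPER_CHUNKS):
        print(passage)
```

The retrieved passages would then be placed into the LLM prompt together with the user's question, which is what lets the answer point back to specific parts of the paper.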