pdfmux — MCP Server by NameetP

by NameetP · MCP Server · ★ 71

Indexed by AgentSkillsHub · Auto-synced every 8h

About pdfmux

pdfmux: self-healing PDF extraction with per-page confidence scoring. Open-source LlamaParse alternative for RAG pipelines, MCP server for Claude Desktop, LangChain + LlamaIndex loaders. Ranked #2 on opendataloader-bench (0.900).

The only PDF extractor that audits its own output. Catches blank pages, scrambled columns, and broken tables, then re-extracts them with a stronger backend. So your LLM gets clean data, not silent garbage. Routes each page to the best of 5 rule-based backends + BYOK LLM fallback (Gemini / Claude / GPT-4o / Ollama). One CLI. One API. Zero config.

PDF ── pdfmux router ── best extractor per page ── audit ── re-extract failures ── Markdown / JSON / chunks
         |
         ├─ PyMuPDF (digital text, 0.01s/page)
         ├─ OpenDataLoader (complex layouts, 0.05s/page)
         ├─ RapidOCR (scanned pages, CPU-only)
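The route, audit, and re-extract loop described above can be illustrated with a small sketch. This is not pdfmux's actual API or scoring logic: `PageResult`, `score_confidence`, `extract_with_audit`, the two toy backends, and the 0.8 threshold are all hypothetical names and values chosen for illustration. The idea shown is only the general pattern of trying a cheap extractor first, scoring its output per page, and escalating to a stronger backend when the score is too low.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration; pdfmux's real data model may differ.
Extractor = Callable[[str], str]


@dataclass
class PageResult:
    page: int          # 1-based page number
    text: str          # extracted text
    backend: str       # name of the backend that produced the text
    confidence: float  # audit score in [0.0, 1.0]


def score_confidence(text: str) -> float:
    """Toy audit: blank output scores 0.0, otherwise the share of
    characters that are alphanumeric or whitespace."""
    stripped = text.strip()
    if not stripped:
        return 0.0
    readable = sum(ch.isalnum() or ch.isspace() for ch in stripped)
    return readable / len(stripped)


def extract_with_audit(
    pages: Sequence[str],
    backends: Sequence[Tuple[str, Extractor]],
    threshold: float = 0.8,  # assumed cut-off, not a pdfmux default
) -> List[PageResult]:
    """Try backends in order (cheapest first) for each page and keep the
    best-scoring result, stopping early once the audit passes."""
    results = []
    for number, raw in enumerate(pages, start=1):
        best = PageResult(number, "", "none", 0.0)
        for name, extractor in backends:
            text = extractor(raw)
            confidence = score_confidence(text)
            if confidence > best.confidence:
                best = PageResult(number, text, name, confidence)
            if confidence >= threshold:
                break  # audit passed, no need for a stronger backend
        results.append(best)
    return results


# Toy backends: the fast one only "reads" digital pages, the OCR one reads all.
def fast_backend(raw: str) -> str:
    return raw[len("digital:"):] if raw.startswith("digital:") else ""


def ocr_backend(raw: str) -> str:
    return raw.split(":", 1)[1]


pages = ["digital:Hello world", "scan:Invoice 42"]
results = extract_with_audit(
    pages, [("fast", fast_backend), ("ocr", ocr_backend)]
)
for result in results:
    print(result.page, result.backend, round(result.confidence, 2))
```

In this sketch the second page comes back blank from the fast backend, fails the audit, and is re-extracted by the fallback. A real audit would need richer signals (reading order, table structure) than a character ratio.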

Topics: ai-agent, docling, document-parsing, llm, mcp, ocr, opendataloader, pdf, pdf-extraction, pdf-to-json

Quick Facts

Stars: 71
Forks: 11
Language: Python
Category: MCP Server
License: MIT
Quality Score: 67.8650315813379/100
Open Issues: 5
Last Updated: 2026-06-24
Created: 2026-03-03
Platforms: cli, mcp, python
Est. Tokens: ~19k

pdfmux alternative? Top 6 similar tools

Looking for a pdfmux alternative? These 6 MCP Server projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction.

  • pdf-mcp by jztan · ⭐ 69

    An MCP server that lets Claude Code and other AI agents work through large PDFs without overflowing their cont

  • MinerU-Skill by Nebutra · ⭐ 52

    AI-Native document parser: PDF, Office & images → clean Markdown with LaTeX, tables & OCR. Zero-dependency CLI

  • DocSentinel by arthurpanhku · ⭐ 89

    MCP server for AI agent for cybersecurity: automate assessment of documents, questionnaires & reports. Multi-f

  • utcp-mcp by universal-tool-calling-protocol · ⭐ 196

    All-in-one MCP server that can connect your AI agents to any native endpoint, powered by UTCP

  • anansi by mdowis · ⭐ 94

    A self-healing web scraper built for hostile sites: selectors repair themselves, browser rendering kicks in wh

  • seerai by dralkh · ⭐ 62

    AI Research assistant plugin for Zotero 9. Chat with your library, run federated scholarly search, RAG, OCR, s

Frequently Asked Questions

What is pdfmux?

pdfmux is PDF extraction that checks its own work: #2 in reading order accuracy, with zero AI, zero GPU, and zero cost. It is categorized as an MCP Server and has 71 GitHub stars.

What programming language is pdfmux written in?

pdfmux is primarily written in Python. It covers topics such as ai-agent, docling, document-parsing.

How do I install or use pdfmux?

You can find installation instructions and usage details in the pdfmux GitHub repository at github.com/NameetP/pdfmux. The project has 71 stars and 11 forks, indicating an active community.

What license does pdfmux use?

pdfmux is released under the MIT license, making it free to use and modify according to the license terms.

What are the best alternatives to pdfmux?

The top alternatives to pdfmux on Agent Skills Hub include pdf-mcp, MinerU-Skill, and DocSentinel. Each takes a different approach to the same problem space; compare them side by side by stars, quality score, and community activity.
