An open API service providing package, version and dependency metadata for many open source software ecosystems and registries.

pypi.org "pdf-processing" keyword

visual-rag-toolkit 0.2.0
End-to-end visual document retrieval with ColPali, featuring two-stage pooling for scalable search
7 versions - Latest release: 26 days ago - 323 downloads last month - 1 star on GitHub - 1 maintainer
pdftl 0.11.1
A capable CLI tool for PDF manipulation inspired by pdftk.
17 versions - Latest release: 27 days ago - 499 downloads last month - 1 star on GitHub - 1 maintainer
journal-vetter 1.0.1
Uses tokenized and carefully summarized journals to query an LLM for analysis, based also on user-d...
2 versions - Latest release: 7 months ago - 13 downloads last month - 0 stars on GitHub - 1 maintainer
mcp-pdf 2.0.14
Secure FastMCP server for comprehensive PDF processing - text extraction, OCR, table extraction, ...
20 versions - Latest release: 19 days ago - 348 downloads last month - 0 stars on GitHub - 1 maintainer
pdfmcp-tools 0.1.1
MCP server for comprehensive PDF processing with 18 specialized tools
2 versions - Latest release: 6 months ago - 11 downloads last month - 1 star on GitHub - 1 maintainer
pdf-snip 0.0.3
A package to help manage PDF pages, images and their conversions during different NLP, CV or othe...
2 versions - Latest release: about 1 year ago - 17 downloads last month - 3 stars on GitHub - 1 maintainer
nutrient-dws 3.0.0
Python client library for Nutrient Document Web Services API
4 versions - Latest release: 21 days ago - 128 downloads last month - 54 stars on GitHub - 1 maintainer
flockparser 1.0.9
Distributed document RAG system with intelligent GPU/CPU orchestration
8 versions - Latest release: 4 months ago - 96 downloads last month - 3 stars on GitHub - 1 maintainer
vlense 0.1.4
A Python package to extract text from images and PDFs using Vision Language Model (VLM).
5 versions - Latest release: over 1 year ago - 34 downloads last month - 1 star on GitHub - 1 maintainer
peslac 0.1.4
A Python package for the Peslac API
5 versions - Latest release: about 1 year ago - 18 downloads last month - 0 stars on GitHub - 1 maintainer
preocr 1.4.0
A fast, layout-aware OCR decision engine for document processing pipelines. Detects whether files...
26 versions - Latest release: 23 days ago - 1 maintainer
aikitx 1.0.0
A comprehensive GUI toolkit for Large Language Models (LLMs) with GGUF support, document processi...
1 version - Latest release: 8 months ago - 56 downloads last month - 0 stars on GitHub - 1 maintainer
papermage 0.20.0
Papermage. Casting magic over scientific PDFs.
8 versions - Latest release: almost 2 years ago - 158 downloads last month - 786 stars on GitHub - 3 maintainers
docling-extractor 1.0.0
Production-grade document extraction with intelligent fallback chain: Docling -> PyMuPDF -> pdfpl...
1 version - Latest release: 2 months ago - 1 maintainer
fileseek 0.1.3
FileSeek – AI-Powered Local Document Archive & Search
3 versions - Latest release: about 1 year ago - 23 downloads last month - 1 maintainer