An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org "document-processing" keyword

View the packages on the pypi.org package registry that are tagged with the "document-processing" keyword.

ai-chunking 0.1.9
A powerful Python library for semantic document chunking and enrichment using AI
8 versions - Latest release: about 1 month ago - 1.45 thousand downloads last month - 3 stars on GitHub - 1 maintainer
pyrhubarb 0.0.6
A Python framework for multi-modal document understanding with generative AI
6 versions - Latest release: 4 days ago - 43.9 thousand downloads last month - 81 stars on GitHub - 1 maintainer
qdrant-loader 0.1.1
A tool for collecting and vectorizing technical content from multiple sources and storing it in a...
4 versions - Latest release: 10 days ago - 285 downloads last month - 2 stars on GitHub - 1 maintainer
kiss-ai-stack-server 0.1.0a17
KISS AI Stack's Server stub - Simplify AI Agent Development
18 versions - Latest release: 4 months ago - 608 downloads last month - 1 star on GitHub - 1 maintainer
magicconvert 0.1.0
MagicConvert is a Python library that converts various document formats (PDF, DOCX, XLSX, PPTX, H...
2 versions - Latest release: 2 months ago - 106 downloads last month - 1 star on GitHub - 1 maintainer
kreuzberg 3.1.3
A text extraction library supporting PDFs, images, office documents and more
19 versions - Latest release: 9 days ago - 6.53 thousand downloads last month - 1,736 stars on GitHub - 1 maintainer
fileseek 0.1.3
FileSeek – AI-Powered Local Document Archive&Search
3 versions - Latest release: 2 months ago - 505 downloads last month - 1 maintainer
contextgem 0.1.1
Easier and faster way to build LLM extraction workflows through powerful abstractions
3 versions - Latest release: 12 days ago - 338 downloads last month - 35 stars on GitHub - 1 maintainer
bytemesumai 0.1.1
Building Blocks for Robust and Context-Aware Retrieval-Augmented Generation
2 versions - Latest release: about 1 month ago - 194 downloads last month - 1 star on GitHub - 1 maintainer
raggy 0.2.6 💰
scraping stuff
18 versions - Latest release: 4 months ago - 1 dependent package - 400 downloads last month - 18 stars on GitHub - 1 maintainer
kiss-ai-stack-core 0.1.0
KISS AI Stack's RAG builder core
26 versions - Latest release: 5 months ago - 757 downloads last month - 1 star on GitHub - 1 maintainer
cucaracha 0.6.0 💰
Mr. Franz Cucaracha will be glad to assist you to the document analysis and processing routine
6 versions - Latest release: 3 months ago - 212 downloads last month - 1 star on GitHub - 1 maintainer
kiss-ai-stack-types 0.1.0a4
KISS AI Stack's common object types
4 versions - Latest release: 4 months ago - 141 downloads last month - 0 stars on GitHub - 1 maintainer
kiss-ai-stack-client 0.1.0a2
KISS AI Stack's Python Client SDK - Simplify AI Agent Development
3 versions - Latest release: 4 months ago - 119 downloads last month - 1 star on GitHub - 1 maintainer
indoxminer 0.1.5
Indox Data Extraction
19 versions - Latest release: 2 months ago - 632 downloads last month - 18 stars on GitHub - 2 maintainers
atai-pdf-tool 0.1.1
A tool for parsing and extracting text from PDF files with OCR capabilities
5 versions - Latest release: about 2 months ago - 277 downloads last month - 0 stars on GitHub - 1 maintainer
tikara 0.1.6
The metadata and text content extractor for almost every file type.
6 versions - Latest release: 3 months ago - 214 downloads last month - 1 star on GitHub - 1 maintainer
markdrop 0.3.1
A comprehensive PDF processing toolkit that converts PDFs to markdown with advanced AI-powered fe...
19 versions - Latest release: 3 months ago - 886 downloads last month - 84 stars on GitHub - 1 maintainer
llamasearch-pdf-llamasearch 0.1.0
A comprehensive PDF processing toolkit for document workflows
1 version - Latest release: 14 days ago
smart-llm-loader 0.1.0
A powerful PDF processing toolkit that seamlessly integrates with LLMs for intelligent document c...
1 version - Latest release: 2 months ago - 78 downloads last month - 54 stars on GitHub - 1 maintainer
arag 0.1.0
A CLI tool for creating, managing, and querying .arag files for RAG applications
1 version - Latest release: about 2 months ago - 105 downloads last month - 1 star on GitHub - 1 maintainer
peslac 0.1.4
A Python package for the Peslac API
5 versions - Latest release: 3 months ago - 158 downloads last month - 0 stars on GitHub - 1 maintainer
pdf-parser-header-footer 0.1.10
A Python package for processing PDFs with header and footer detection
10 versions - Latest release: 24 days ago - 131 downloads last month - 1 maintainer
atai-ebook-tool 0.0.4
A command-line tool for parsing ebooks (such as EPUB and MOBI) and converting them into a structu...
3 versions - Latest release: 25 days ago - 304 downloads last month - 0 stars on GitHub - 1 maintainer
pdfsegmenter 0.1
This library builds a graph-representation of the content of PDFs. The graph is then clustered, r...
1 version - Latest release: over 4 years ago - 1 dependent repository - 38 downloads last month - 22 stars on GitHub - 1 maintainer
aimq 0.1.0
A robust message queue processor for Supabase pgmq with AI-powered document processing capabilities
1 version - Latest release: 3 months ago - 83 downloads last month - 0 stars on GitHub - 1 maintainer
spanish-pdf-parser 0.1.0
A Python package for processing PDFs with header and footer detection
1 version - Latest release: 3 months ago - 62 downloads last month - 1 maintainer
atai-gemma3-tool 0.0.1
CLI tool for generating text from images using the Gemma 3 model.
1 version - Latest release: about 1 month ago - 1 maintainer
Related Keywords
ai (14), pdf (11), llm (11), rag (9), text-extraction (9), ocr (9), machine-learning (6), text-processing (5), openai (5), nlp (4), image-to-text (4), python3 (4), markdown (3), unstructured-data (3), information-extraction (3), structured-data (3), agent (3), pdf-to-markdown (3), chunking (3), ai-agent (3), ai-agents-framework (3), artificial-intelligence (3), boilerplate-application (3), chromadb (3), gen-ai (3), embeddings (3), image-extraction (3), document (2), retrieval-augmented-generation (2), document-parsing (2), pdf-to-text (2), text-analysis (2), document-indexing (2), pdf-processing (2), vector-search (2), document-management (2), parser (2), data-extraction (2), document-intelligence (2), document-understanding (2), content-extraction (2), metadata (2), file-processing (2), document-ocr (2), llms (2), document-analysis (2), document-classification (2), natural-language-processing (2), ml (2), generative-ai (2), semantic-analysis (2), format-detection (2), vector-database (2), file-conversion (2), table-extraction (2), docx (2), mime-type (1), pdf-to-image (1), pdf-to-table (1), pdf-to-markdown-converter (1), pdf-to-markdown-tool (1), pdf-to-markdown-ai (1), table-detection (1), ai-powered-pdf-processing (1), advanced-pdf-processing (1), document-structure-preservation (1), high-quality-image-extraction (1), advanced-ml-models (1), content-description (1), generation (1), local-files (1), url-support (1), docling (1), markdrop (1), marker (1), office-documents (1), pdf-parsing (1), powerpoint (1), text-analytics (1), text-mining (1), intelligent-document-processing (1), text-parsing (1), language-detection (1), text-recognition (1), multi-modal (1), tika (1), asyncio (1), word-documents (1), java (1), metadata-extraction (1), amazon-bedrock (1), converter (1), image-analysis (1), markitdown (1), image-processing (1), image-ocr (1), pdf-ocr (1), pdf-splitter (1), pdf-merger (1), python (1)