pypi.org "document-processing" keyword
View the packages on the pypi.org package registry that are tagged with the "document-processing" keyword.
ai-chunking 0.1.9
A powerful Python library for semantic document chunking and enrichment using AI
8 versions - Latest release: about 1 month ago - 1.45 thousand downloads last month - 3 stars on GitHub - 1 maintainer
pyrhubarb 0.0.6
A Python framework for multi-modal document understanding with generative AI
6 versions - Latest release: 4 days ago - 43.9 thousand downloads last month - 81 stars on GitHub - 1 maintainer
qdrant-loader 0.1.1
A tool for collecting and vectorizing technical content from multiple sources and storing it in a...
4 versions - Latest release: 10 days ago - 285 downloads last month - 2 stars on GitHub - 1 maintainer
kiss-ai-stack-server 0.1.0a17
KISS AI Stack's Server stub - Simplify AI Agent Development
18 versions - Latest release: 4 months ago - 608 downloads last month - 1 star on GitHub - 1 maintainer
magicconvert 0.1.0
MagicConvert is a Python library that converts various document formats (PDF, DOCX, XLSX, PPTX, H...
2 versions - Latest release: 2 months ago - 106 downloads last month - 1 star on GitHub - 1 maintainer
kreuzberg 3.1.3
A text extraction library supporting PDFs, images, office documents and more
19 versions - Latest release: 9 days ago - 6.53 thousand downloads last month - 1,736 stars on GitHub - 1 maintainer
fileseek 0.1.3
FileSeek – AI-Powered Local Document Archive&Search
3 versions - Latest release: 2 months ago - 505 downloads last month - 1 maintainer
contextgem 0.1.1
Easier and faster way to build LLM extraction workflows through powerful abstractions
3 versions - Latest release: 12 days ago - 338 downloads last month - 35 stars on GitHub - 1 maintainer
bytemesumai 0.1.1
Building Blocks for Robust and Context-Aware Retrieval-Augmented Generation
2 versions - Latest release: about 1 month ago - 194 downloads last month - 1 star on GitHub - 1 maintainer
raggy 0.2.6 💰
scraping stuff
18 versions - Latest release: 4 months ago - 1 dependent package - 400 downloads last month - 18 stars on GitHub - 1 maintainer
kiss-ai-stack-core 0.1.0
KISS AI Stack's RAG builder core
26 versions - Latest release: 5 months ago - 757 downloads last month - 1 star on GitHub - 1 maintainer
cucaracha 0.6.0 💰
Mr. Franz Cucaracha will be glad to assist you to the document analysis and processing routine
6 versions - Latest release: 3 months ago - 212 downloads last month - 1 star on GitHub - 1 maintainer
kiss-ai-stack-types 0.1.0a4
KISS AI Stack's common object types
4 versions - Latest release: 4 months ago - 141 downloads last month - 0 stars on GitHub - 1 maintainer
kiss-ai-stack-client 0.1.0a2
KISS AI Stack's Python Client SDK - Simplify AI Agent Development
3 versions - Latest release: 4 months ago - 119 downloads last month - 1 star on GitHub - 1 maintainer
indoxminer 0.1.5
Indox Data Extraction
19 versions - Latest release: 2 months ago - 632 downloads last month - 18 stars on GitHub - 2 maintainers
atai-pdf-tool 0.1.1
A tool for parsing and extracting text from PDF files with OCR capabilities
5 versions - Latest release: about 2 months ago - 277 downloads last month - 0 stars on GitHub - 1 maintainer
tikara 0.1.6
The metadata and text content extractor for almost every file type.
6 versions - Latest release: 3 months ago - 214 downloads last month - 1 star on GitHub - 1 maintainer
markdrop 0.3.1
A comprehensive PDF processing toolkit that converts PDFs to markdown with advanced AI-powered fe...
19 versions - Latest release: 3 months ago - 886 downloads last month - 84 stars on GitHub - 1 maintainer
llamasearch-pdf-llamasearch 0.1.0
A comprehensive PDF processing toolkit for document workflows
1 version - Latest release: 14 days ago
smart-llm-loader 0.1.0
A powerful PDF processing toolkit that seamlessly integrates with LLMs for intelligent document c...
1 version - Latest release: 2 months ago - 78 downloads last month - 54 stars on GitHub - 1 maintainer
arag 0.1.0
A CLI tool for creating, managing, and querying .arag files for RAG applications
1 version - Latest release: about 2 months ago - 105 downloads last month - 1 star on GitHub - 1 maintainer
peslac 0.1.4
A Python package for the Peslac API
5 versions - Latest release: 3 months ago - 158 downloads last month - 0 stars on GitHub - 1 maintainer
pdf-parser-header-footer 0.1.10
A Python package for processing PDFs with header and footer detection
10 versions - Latest release: 24 days ago - 131 downloads last month - 1 maintainer
atai-ebook-tool 0.0.4
A command-line tool for parsing ebooks (such as EPUB and MOBI) and converting them into a structu...
3 versions - Latest release: 25 days ago - 304 downloads last month - 0 stars on GitHub - 1 maintainer
pdfsegmenter 0.1
This library builds a graph-representation of the content of PDFs. The graph is then clustered, r...
1 version - Latest release: over 4 years ago - 1 dependent repository - 38 downloads last month - 22 stars on GitHub - 1 maintainer
aimq 0.1.0
A robust message queue processor for Supabase pgmq with AI-powered document processing capabilities
1 version - Latest release: 3 months ago - 83 downloads last month - 0 stars on GitHub - 1 maintainer
spanish-pdf-parser 0.1.0
A Python package for processing PDFs with header and footer detection
1 version - Latest release: 3 months ago - 62 downloads last month - 1 maintainer
atai-gemma3-tool 0.0.1
CLI tool for generating text from images using the Gemma 3 model.
1 version - Latest release: about 1 month ago - 1 maintainer
Related Keywords
ai (14)
pdf (11)
llm (11)
rag (9)
text-extraction (9)
ocr (9)
machine-learning (6)
text-processing (5)
openai (5)
nlp (4)
image-to-text (4)
python3 (4)
markdown (3)
unstructured-data (3)
information-extraction (3)
structured-data (3)
agent (3)
pdf-to-markdown (3)
chunking (3)
ai-agent (3)
ai-agents-framework (3)
artificial-intelligence (3)
boilerplate-application (3)
chromadb (3)
gen-ai (3)
embeddings (3)
image-extraction (3)
document (2)
retrieval-augmented-generation (2)
document-parsing (2)
pdf-to-text (2)
text-analysis (2)
document-indexing (2)
pdf-processing (2)
vector-search (2)
document-management (2)
parser (2)
data-extraction (2)
document-intelligence (2)
document-understanding (2)
content-extraction (2)
metadata (2)
file-processing (2)
document-ocr (2)
llms (2)
document-analysis (2)
document-classification (2)
natural-language-processing (2)
ml (2)
generative-ai (2)
semantic-analysis (2)
format-detection (2)
vector-database (2)
file-conversion (2)
table-extraction (2)
docx (2)
mime-type (1)
pdf-to-image (1)
pdf-to-table (1)
pdf-to-markdown-converter (1)
pdf-to-markdown-tool (1)
pdf-to-markdown-ai (1)
table-detection (1)
ai-powered-pdf-processing (1)
advanced-pdf-processing (1)
document-structure-preservation (1)
high-quality-image-extraction (1)
advanced-ml-models (1)
content-description (1)
generation (1)
local-files (1)
url-support (1)
docling (1)
markdrop (1)
marker (1)
office-documents (1)
pdf-parsing (1)
powerpoint (1)
text-analytics (1)
text-mining (1)
intelligent-document-processing (1)
text-parsing (1)
language-detection (1)
text-recognition (1)
multi-modal (1)
tika (1)
asyncio (1)
word-documents (1)
java (1)
metadata-extraction (1)
amazon-bedrock (1)
converter (1)
image-analysis (1)
markitdown (1)
image-processing (1)
image-ocr (1)
pdf-ocr (1)
pdf-splitter (1)
pdf-merger (1)
python (1)