pypi.org "document-parser" keyword
View the packages on the pypi.org package registry that are tagged with the "document-parser" keyword.
semantic-ai 0.0.6
Semantic AI RAG System
8 versions - Latest release: about 1 year ago - 32 downloads last month - 18 stars on GitHub - 1 maintainer
marie-ai 3.0.29
Python library to Integrate AI-powered features into your applications
7 versions - Latest release: over 1 year ago - 24 downloads last month - 72 stars on GitHub - 1 maintainer
docling-enhanced 2.32.0
SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for power...
1 version - Latest release: 4 months ago - 31 downloads last month - 37,086 stars on GitHub - 1 maintainer
vision-parse 0.1.13
Parse PDF documents into markdown formatted content using Vision LLMs
14 versions - Latest release: 7 months ago - 1.7 thousand downloads last month - 423 stars on GitHub - 1 maintainer
graphlit-client 1.0.20250904001
Graphlit API Python Client
177 versions - Latest release: about 20 hours ago - 1.4 thousand downloads last month - 5 stars on GitHub - 1 maintainer
opendataloader-pdf 0.0.9
A Python wrapper for the opendataloader-pdf Java CLI.
6 versions - Latest release: 1 day ago - 290 downloads last month - 6 stars on GitHub - 1 maintainer
unstructured 0.18.14
Top 1.5% on pypi.org
A library that prepares raw documents for downstream ML tasks.
192 versions - Latest release: 10 days ago - 113 dependent packages - 3,374 dependent repositories - 3.47 million downloads last month - 12,544 stars on GitHub - 1 maintainer
unstructured-cpu 0.15.1
A library that prepares raw documents for downstream ML tasks.
13 versions - Latest release: about 1 year ago - 104 downloads last month - 12,544 stars on GitHub - 1 maintainer
llama-index-readers-llama-parse 0.5.0
llama-index readers llama-parse integration
9 versions - Latest release: about 1 month ago - 6 dependent packages - 2.22 million downloads last month - 3,956 stars on GitHub - 1 maintainer
llama-cloud-services 0.6.63
Tailored SDK clients for LlamaCloud services.
62 versions - Latest release: 3 days ago - 9.48 million downloads last month - 3,956 stars on GitHub - 1 maintainer
docstrange 1.1.5
Extract and Convert PDF, Word, PowerPoint, Excel, images, URLs into multiple formats (Markdown, J...
16 versions - Latest release: 3 days ago - 1.93 thousand downloads last month - 493 stars on GitHub - 1 maintainer
llama-index-readers-docling 0.4.0
llama-index readers docling integration
7 versions - Latest release: about 1 month ago - 14.2 thousand downloads last month - 27,013 stars on GitHub - 1 maintainer
novalad 0.1.15
Novalad: AI-powered platform for transforming unstructured documents like PDFs and PowerPoints in...
16 versions - Latest release: 4 days ago - 43 downloads last month - 17 stars on GitHub - 1 maintainer
anyparser-crewai 0.0.2
Anyparser CrewAI Integration
2 versions - Latest release: 7 months ago - 29 downloads last month - 1 star on GitHub - 1 maintainer
mixedbread-ai-langchain 1.0.2
The official Mixedbread AI integration for LangChain.
2 versions - Latest release: 2 months ago - 76 downloads last month - 0 stars on GitHub - 1 maintainer
openparse 0.7.0
Streamlines the process of preparing documents for LLM's.
17 versions - Latest release: 10 months ago - 3.54 thousand downloads last month - 3,048 stars on GitHub - 1 maintainer
llm-parse 0.1.5
Parse data from documents optimised for downstream llm tasks.
6 versions - Latest release: 2 months ago - 103 downloads last month - 3,859 stars on GitHub - 1 maintainer
deepdoctection 0.45.0
Top 4.6% on pypi.org
Repository for Document AI
54 versions - Latest release: 8 days ago - 14 dependent repositories - 4.32 thousand downloads last month - 2,931 stars on GitHub - 1 maintainer
extended-docling 2.12.1
SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for power...
1 version - Latest release: 8 months ago - 22 downloads last month - 36,525 stars on GitHub - 1 maintainer
autorag 0.3.17 💰
Automatically Evaluate RAG pipelines with your own data. Find optimal structure for new RAG product.
69 versions - Latest release: 10 days ago - 1.08 thousand downloads last month - 4,151 stars on GitHub - 1 maintainer
docling-google-ocr 2.13.1
SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for power...
2 versions - Latest release: 7 months ago - 32 downloads last month - 36,025 stars on GitHub - 1 maintainer
python-docparser 1.1.0
Extract text from your docx document.
3 versions - Latest release: over 2 years ago - 176 downloads last month - 11 stars on GitHub - 1 maintainer
llamarker 1.0.2
A universal GenAI-based local parser for complex documents of all types.
3 versions - Latest release: 8 months ago - 24 downloads last month - 1 star on GitHub - 1 maintainer
clearedge 0.1.17
Build a RAG preprocessing pipeline
18 versions - Latest release: over 1 year ago - 74 downloads last month - 11 stars on GitHub - 1 maintainer
df-extract 0.0.2
DecisionFacts Extraction Library extracts content from PDF, PPTX, Docx, png, jpg., and convert as...
3 versions - Latest release: almost 2 years ago - 1 dependent package - 1 dependent repository - 26 downloads last month - 14 stars on GitHub - 1 maintainer
llama-index-node-parser-docling 0.4.0
llama-index node_parser docling integration
6 versions - Latest release: about 1 month ago - 23.9 thousand downloads last month - 28,777 stars on GitHub - 1 maintainer
Related Keywords
pdf (17)
pdf-to-json (14)
document-parsing (13)
pdf-to-text (12)
pptx (10)
tables (10)
markdown (9)
docx (9)
ai (8)
ocr (8)
document (7)
llm (7)
rag (6)
html (6)
documents (6)
pdf-converter (6)
pdf-to-markdown (6)
python (5)
convert (5)
xlsx (5)
parsing (5)
nlp (5)
PDF (4)
structured-data (4)
table-detection (4)
retrieval-augmented-generation (4)
machine-learning (4)
langchain (4)
ppt-to-json (3)
table-recognition (3)
docling (3)
pdf-to-excel (3)
pdf-document-processor (3)
docx-to-markdown (3)
layout model (3)
segmentation (3)
table structure (3)
table former (3)
document-image-analysis (3)
ppt-to-markdown (3)
deep-learning (3)
pdf-parser (2)
text-extraction (2)
NLP (2)
HTML (2)
CV (2)
XML (2)
preprocessing (2)
data-pipelines (2)
document-image-processing (2)
donut (2)
information-retrieval (2)
ml (2)
natural-language-processing (2)
llama (2)
document-understanding (2)
document-ai (2)
genai (2)
document parsing (2)
AI (2)
pubtabnet (2)
pytorch (2)
embedding (2)
publaynet (2)
document-layout-analysis (2)
knowledge-graph (1)
typescript (1)
mixedbread-ai (1)
reranking (1)
document-loader (1)
extraction (1)
retrieval (1)
document-structure (1)
layout-parsing (1)
layoutlm (1)
tensorflow (1)
RAG (1)
AutoRAG (1)
autorag (1)
rag-evaluation (1)
pdf-ocr-extraction (1)
PowerPoint (1)
data extraction (1)
unstructured data (1)
layout parser (1)
api (1)
layout-parser (1)
python3 (1)
novalad (1)
anyparser (1)
artificial-intelligence (1)
cache-augmented-generation (1)
cag (1)
crew-ai (1)
crew-ai-rag (1)
crewai (1)
crewai-rag (1)
kag (1)
llm-evaluation (1)
llm-ops (1)