Ecosyste.ms: Packages
An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.
pypi.org "pdf-to-text" keyword
Top 1.5% on pypi.org
unstructured 0.14.0
A library that prepares raw documents for downstream ML tasks.
133 versions - Latest release: 3 days ago - 113 dependent packages - 3,374 dependent repositories - 1.13 million downloads last month - 4,064 stars on GitHub - 1 maintainer
pcu-pdf 1.2.2
PDF parser component (Apache Tika) for PCU project
2 versions - Latest release: over 5 years ago - 1 dependent repository - 26 downloads last month - 1 star on GitHub - 1 maintainer
pcu-io 1.2.2
IO management for PCU project
2 versions - Latest release: over 5 years ago - 1 dependent repository - 21 downloads last month - 0 stars on GitHub - 1 maintainer
clearedge 0.1.17
Build a RAG preprocessing pipeline
18 versions - Latest release: about 2 months ago - 332 downloads last month - 7 stars on GitHub - 1 maintainer
Related Keywords
pdf: 4
pdf-to-json: 2
ocr: 2
parser: 2
pcu: 2
llm: 2
langchain: 2
python: 2
document-parser: 2
apache: 1
component: 1
pdf-parser-component: 1
tika: 1
input-output: 1
json: 1
json-to-text: 1
pcu-io: 1
text: 1
haystack: 1
llamaindex: 1
pdf-ocr-extraction: 1
rag-pipeline: 1
retrieval-augmented-generation: 1
table-detection: 1
table-recognition: 1
NLP: 1
PDF: 1
HTML: 1
CV: 1
XML: 1
parsing: 1
preprocessing: 1
data-pipelines: 1
deep-learning: 1
document-image-analysis: 1
document-image-processing: 1
document-parsing: 1
docx: 1
donut: 1
information-retrieval: 1
machine-learning: 1
ml: 1
natural-language-processing: 1
nlp: 1