An open API service providing package, version and dependency metadata for many open source software ecosystems and registries.

pypi.org: unstructured-cpu

A library that prepares raw documents for downstream ML tasks.

purl: pkg:pypi/unstructured-cpu
Keywords: NLP, PDF, HTML, CV, XML, parsing, preprocessing, data-pipelines, deep-learning, document-image-analysis, document-image-processing, document-parser, document-parsing, docx, donut, information-retrieval, langchain, llm, machine-learning, ml, natural-language-processing, nlp, ocr, pdf, pdf-to-json, pdf-to-text
License: Apache-2.0
Latest release: 8 months ago
First release: 8 months ago
Downloads: 368 last month
Stars: 10,877 on GitHub
Forks: 904 on GitHub
Total Commits: 1621
Committers: 116
Average commits per author: 13.974
Development Distribution Score (DDS): 0.84
More commit stats: commits.ecosyste.ms
See more repository details: repos.ecosyste.ms
Last synced: about 5 hours ago
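
The derived figures in the listing above can be checked with a few lines of Python. This is a minimal sketch, not the service's own code: the helper `split_purl` is a hypothetical name introduced here, and it assumes only the minimal `pkg:type/name` form shown in the listing (no namespace, version, or qualifiers). The commit counts are copied from the listing.

```python
# Figures copied from the listing above.
total_commits = 1621
committers = 116

# "Average commits per author" is total commits divided by committers.
average_commits = total_commits / committers
print(round(average_commits, 3))


def split_purl(purl: str) -> tuple[str, str]:
    """Split a minimal 'pkg:type/name' purl into (type, name).

    Hypothetical helper for illustration only; it does not handle
    namespaces, versions, qualifiers, or subpaths.
    """
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg":
        raise ValueError(f"not a purl: {purl!r}")
    pkg_type, _, name = rest.partition("/")
    if not pkg_type or not name:
        raise ValueError(f"expected 'pkg:type/name', got {purl!r}")
    return pkg_type, name


print(split_purl("pkg:pypi/unstructured-cpu"))
```

For anything beyond this minimal form, a dedicated purl parser is likely a better choice than string splitting.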

Top 1.5% on pypi.org
unstructured 0.17.0
A library that prepares raw documents for downstream ML tasks.
181 versions - Latest release: about 1 month ago - 113 dependent packages - 3,374 dependent repositories - 2.7 million downloads last month - 9,368 stars on GitHub - 1 maintainer