pypi.org : unstructured-cpu
A library that prepares raw documents for downstream ML tasks.
purl: pkg:pypi/unstructured-cpu
Keywords: NLP, PDF, HTML, CV, XML, parsing, preprocessing, data-pipelines, deep-learning, document-image-analysis, document-image-processing, document-parser, document-parsing, docx, donut, information-retrieval, langchain, llm, machine-learning, ml, natural-language-processing, nlp, ocr, pdf, pdf-to-json, pdf-to-text
License: Apache-2.0
Latest release: 8 months ago
First release: 8 months ago
Downloads: 368 last month
Stars: 10,877 on GitHub
Forks: 904 on GitHub
Total Commits: 1621
Committers: 116
Average commits per author: 13.974
Development Distribution Score (DDS): 0.84
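The two derived figures above can be checked with a little arithmetic. The sketch below recomputes the average from the listed totals. It assumes DDS is defined as 1 minus the top committer's share of all commits; that definition is not stated on this page. The function name `dds` and the figure of 259 commits for the top committer are hypothetical, chosen only to show what a score of about 0.84 would imply.

```python
# Figures taken from the listing above.
total_commits = 1621
committers = 116

# Average commits per author is total commits divided by committer count.
average = total_commits / committers
print(round(average, 3))


def dds(top_author_commits: int, total: int) -> float:
    """Development Distribution Score, ASSUMED here to be
    1 - (commits by the most active author / total commits)."""
    return 1 - top_author_commits / total


# Hypothetical: a top committer with 259 commits (not a figure from the page).
# Under the assumed definition this yields a score close to the listed 0.84.
print(round(dds(259, total_commits), 2))
```

Under that assumed definition, a score near 1 means commits are spread across many authors, and a score near 0 means one author dominates.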
More commit stats: commits.ecosyste.ms
See more repository details: repos.ecosyste.ms
Last synced: about 22 hours ago