proxy.golang.org : github.com/unstructured-io/unstructured
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.
purl: pkg:golang/github.com/unstructured-io/unstructured
Keywords: data-pipelines, deep-learning, document-image-analysis, document-image-processing, document-parser, document-parsing, docx, donut, information-retrieval, langchain, llm, machine-learning, ml, natural-language-processing, nlp, ocr, pdf, pdf-to-json, pdf-to-text, preprocessing
License: Apache-2.0
Latest release: 17 days ago
Namespace: github.com/unstructured-io
Stars: 12,775 on GitHub
Forks: 1,042 on GitHub
Total Commits: 1,621
Committers: 116
Average commits per author: 13.974
Development Distribution Score (DDS): 0.84
More commit stats: commits.ecosyste.ms
See more repository details: repos.ecosyste.ms
Last synced: 17 days ago