proxy.golang.org : github.com/unstructured-io/unstructured
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.
    
purl: pkg:golang/github.com/unstructured-io/unstructured
      
Keywords: data-pipelines, deep-learning, document-image-analysis, document-image-processing, document-parser, document-parsing, docx, donut, information-retrieval, langchain, llm, machine-learning, ml, natural-language-processing, nlp, ocr, pdf, pdf-to-json, pdf-to-text, preprocessing

License: Apache-2.0
        
Latest release: about 2 months ago
      
Namespace: github.com/unstructured-io
    
      
Stars: 12,775 on GitHub
      
Forks: 1,042 on GitHub
      
Total Commits: 1621
      
Committers: 116
      
Average commits per author: 13.974
      
Development Distribution Score (DDS): 0.84
      
More commit stats: commits.ecosyste.ms
      
See more repository details: repos.ecosyste.ms
      
Last synced: about 2 months ago