An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

Top 6.7% on proxy.golang.org

proxy.golang.org: github.com/alibaba/data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷

purl: pkg:golang/github.com/alibaba/data-juicer
Keywords: chinese, data-analysis, data-science, data-visualization, dataset, gpt, gpt-4, instruction-tuning, large-language-models, llama, llava, llm, llms, multi-modal, nlp, opendata, pre-training, pytorch, sora, streamlit
License: Apache-2.0
Latest release: 3 months ago
First release: over 2 years ago
Stars: 1,321 on GitHub
Forks: 78 on GitHub
Total Commits: 63
Committers: 13
Average commits per author: 4.846
Development Distribution Score (DDS): 0.635
More commit stats: commits.ecosyste.ms
See more repository details: repos.ecosyste.ms
Last synced: 19 days ago
