pypi.org : py-data-juicer
A One-Stop Data Processing System for Large Language Models.
Registry - Source - Documentation - JSON
purl: pkg:pypi/py-data-juicer
Keywords: chinese, data-analysis, data-science, data-visualization, dataset, gpt, gpt-4, instruction-tuning, large-language-models, llama, llava, llm, llms, multi-modal, nlp, opendata, pre-training, pytorch, sora, streamlit
License: Apache-2.0
Latest release: 11 days ago
First release: over 1 year ago
Downloads: 903 last month
Stars: 1,321 on GitHub
Forks: 78 on GitHub
Total Commits: 63
Committers: 13
Average commits per author: 4.846
Development Distribution Score (DDS): 0.635
More commit stats: commits.ecosyste.ms
See more repository details: repos.ecosyste.ms
Last synced: 11 days ago
data-juicer
3 packages - 2,926 downloads