pypi.org "large-scale-data-processing" keyword
data-prep-toolkit-transforms-ray 0.2.1
Data Preparation Toolkit Transforms using Ray - 5 versions - Latest release: over 1 year ago - 11 downloads last month - 622 stars on GitHub - 2 maintainers
data-prep-toolkit-lang 1.0.0a0
Data Preparation Toolkit Transforms using Ray - 2 versions - Latest release: about 1 year ago - 24 downloads last month - 622 stars on GitHub - 1 maintainer
data-prep-toolkit-transforms 1.1.7
Data Preparation Toolkit Transforms using Ray - 53 versions - Latest release: 26 days ago - 14.5 thousand downloads last month - 646 stars on GitHub - 4 maintainers
invisible-unicorn 0.4.0
Scalable Data Preprocessing Tool for Training Large Language Models - 1 version - Latest release: over 1 year ago - 10 downloads last month - 1,111 stars on GitHub - 1 maintainer
nemo-curator 1.0.0
Scalable Data Preprocessing Tool for Training Large Language Models - 19 versions - Latest release: 5 months ago - 2.49 thousand downloads last month - 1,130 stars on GitHub - 6 maintainers
data-prep-toolkit-transforms-lang1 0.2.2
Data Preparation Toolkit Transforms - 2 versions - Latest release: over 1 year ago - 38 downloads last month - 622 stars on GitHub - 1 maintainer
invisible-rabbit 0.5.0
Scalable Data Preprocessing Tool for Training Large Language Models - 4 versions - Latest release: over 1 year ago - 16 downloads last month - 1,130 stars on GitHub - 1 maintainer
data-prep-toolkit-idiud 1.1.0
Subset of Data Preparation Toolkit Transforms - 1 version - Latest release: 10 months ago - 19 downloads last month - 646 stars on GitHub - 1 maintainer
Related Keywords
python (8)
llm (8)
large-language-models (8)
fine-tuning (8)
llmapps (8)
data (8)
data-prep (8)
data-preparation (8)
deduplication (8)
datarecipes (8)
datacuration (8)
spark (5)
ray (5)
malware (5)
finetuning (5)
data-preprocessing-pipelines (5)
data-preprocessing (5)
code-quality (5)
ai (5)
generative (5)
data preparation (5)
data preprocessing (5)
transforms (5)
data-curation (3)
data-processing (3)
data-processing-pipelines (3)
data-quality (3)
fast-data-processing (3)
llm-data-quality (3)
semantic-deduplication (3)