An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org "large-scale-data-processing" keyword

data-prep-toolkit-transforms-ray 0.2.1
Data Preparation Toolkit Transforms using Ray
5 versions - Latest release: over 1 year ago - 11 downloads last month - 622 stars on GitHub - 2 maintainers

data-prep-toolkit-lang 1.0.0a0
Data Preparation Toolkit Transforms using Ray
2 versions - Latest release: about 1 year ago - 24 downloads last month - 622 stars on GitHub - 1 maintainer

data-prep-toolkit-transforms 1.1.7
Data Preparation Toolkit Transforms using Ray
53 versions - Latest release: 26 days ago - 14.5 thousand downloads last month - 646 stars on GitHub - 4 maintainers

invisible-unicorn 0.4.0
Scalable Data Preprocessing Tool for Training Large Language Models
1 version - Latest release: over 1 year ago - 10 downloads last month - 1,111 stars on GitHub - 1 maintainer

nemo-curator 1.0.0
Scalable Data Preprocessing Tool for Training Large Language Models
19 versions - Latest release: 5 months ago - 2.49 thousand downloads last month - 1,130 stars on GitHub - 6 maintainers

data-prep-toolkit-transforms-lang1 0.2.2
Data Preparation Toolkit Transforms
2 versions - Latest release: over 1 year ago - 38 downloads last month - 622 stars on GitHub - 1 maintainer

invisible-rabbit 0.5.0
Scalable Data Preprocessing Tool for Training Large Language Models
4 versions - Latest release: over 1 year ago - 16 downloads last month - 1,130 stars on GitHub - 1 maintainer

data-prep-toolkit-idiud 1.1.0
Subset of Data Preparation Toolkit Transforms
1 version - Latest release: 10 months ago - 19 downloads last month - 646 stars on GitHub - 1 maintainer