pypi.org "data-processing-pipelines" keyword
View the packages on the pypi.org package registry that are tagged with the "data-processing-pipelines" keyword.
nemo-curator 0.7.1
Scalable Data Preprocessing Tool for Training Large Language Models
16 versions - Latest release: 22 days ago - 1.5 thousand downloads last month - 879 stars on GitHub - 5 maintainers
invisible-unicorn 0.4.0
Scalable Data Preprocessing Tool for Training Large Language Models
1 version - Latest release: 7 months ago - 54 downloads last month - 879 stars on GitHub - 1 maintainer
invisible-rabbit 0.5.0
Scalable Data Preprocessing Tool for Training Large Language Models
4 versions - Latest release: 6 months ago - 213 downloads last month - 879 stars on GitHub - 1 maintainer
convtools 1.14.4 💰
dynamic, declarative data transformations with automatic code generation
113 versions - Latest release: about 1 month ago - 3.04 thousand downloads last month - 40 stars on GitHub - 1 maintainer
artifician 0.6.4
Artifician is an event driven framework developed to simplify the process of preparation of the d...
35 versions - Latest release: about 1 year ago - 1.22 thousand downloads last month - 10 stars on GitHub - 1 maintainer
thepipe 1.3.8
A lightweight, general purpose pipeline framework.
15 versions - Latest release: almost 3 years ago - 1 dependent package - 2 dependent repositories - 1.25 thousand downloads last month - 14 stars on GitHub - 2 maintainers
graphbook 0.13.3
The AI-driven data pipeline and workflow framework for data scientists and machine learning engin...
23 versions - Latest release: 11 days ago - 957 downloads last month - 35 stars on GitHub - 1 maintainer
graphbook_huggingface 0.0.6
Graphbook Hugging Face Plugin for no-code Hugging Face AI pipelines
5 versions - Latest release: 19 days ago - 237 downloads last month - 35 stars on GitHub - 1 maintainer
Related Keywords
data-processing (7)
python (6)
machine-learning (3)
data-science (3)
semantic-deduplication (3)
llmapps (3)
llm-data-quality (3)
llm (3)
large-scale-data-processing (3)
large-language-models (3)
data (3)
data-curation (3)
data-prep (3)
data-preparation (3)
data-quality (3)
fine-tuning (3)
fast-data-processing (3)
deduplication (3)
datarecipes (3)
datacuration (3)
research (2)
ai (2)
machine learning (2)
data science (2)
pytorch (2)
framework (2)
workflow (2)
ml (2)
pipelines (2)
huggingface (1)
provenance (1)
hacktoberfest (1)
dataset-preparation (1)
artificial-intelligence (1)
artifician (1)
transformations (1)
parsing (1)
data-analysis (1)
csv-converter (1)
csv (1)
conversions (1)
code-generation (1)
convtools (1)
codegen (1)
converters (1)
etl (1)