pypi.org "llmapps" keyword
View the packages on the pypi.org package registry that are tagged with the "llmapps" keyword.
nemo-curator 0.7.1
Scalable Data Preprocessing Tool for Training Large Language Models - 16 versions - Latest release: 20 days ago - 1.34 thousand downloads last month - 873 stars on GitHub - 5 maintainers
data-prep-toolkit-spark 0.2.0
Data Preparation Toolkit Library for Spark - 3 versions - Latest release: 10 months ago - 105 downloads last month - 3 maintainers
invisible-rabbit 0.5.0
Scalable Data Preprocessing Tool for Training Large Language Models - 4 versions - Latest release: 6 months ago - 172 downloads last month - 868 stars on GitHub - 1 maintainer
invisible-unicorn 0.4.0
Scalable Data Preprocessing Tool for Training Large Language Models - 1 version - Latest release: 7 months ago - 46 downloads last month - 868 stars on GitHub - 1 maintainer
data-prep-toolkit-ray 0.2.1
Data Preparation Toolkit Library for Ray - 8 versions - Latest release: 7 months ago - 1.35 thousand downloads last month - 4 maintainers
data-prep-toolkit-transforms 1.1.0
Data Preparation Toolkit Transforms using Ray - 26 versions - Latest release: about 1 month ago - 8.84 thousand downloads last month - 531 stars on GitHub - 3 maintainers
data-prep-toolkit 0.2.4
Data Preparation Toolkit Library for Ray and Python - 27 versions - Latest release: about 1 month ago - 1 dependent package - 11 thousand downloads last month - 7 maintainers
data-prep-toolkit-transforms-lang1 0.2.2
Data Preparation Toolkit Transforms - 2 versions - Latest release: 7 months ago - 93 downloads last month - 531 stars on GitHub - 1 maintainer
data-prep-toolkit-lang 1.0.0a0
Data Preparation Toolkit Transforms using Ray - 2 versions - Latest release: 4 months ago - 100 downloads last month - 531 stars on GitHub - 1 maintainer
data-prep-toolkit-transforms-ray 0.2.1
Data Preparation Toolkit Transforms using Ray - 5 versions - Latest release: 7 months ago - 198 downloads last month - 531 stars on GitHub - 2 maintainers
data-prep-connector 0.2.3
Scalable and Compliant Web Crawler - 5 versions - Latest release: 5 months ago - 8.26 thousand downloads last month - 2 maintainers
data-prep-toolkit-flows 0.2.0
Data Preparation Toolkit Library for creation and execution of transformers flows - 3 versions - Latest release: 9 months ago - 128 downloads last month - 1 maintainer
dpk-tokenization-transform-python removed
Tokenization Transform for Python - 1 version
Related Keywords
llm (13)
fine-tuning (13)
data (13)
ai (10)
generative (10)
data preparation (9)
data preprocessing (9)
python (7)
large-scale-data-processing (7)
large-language-models (7)
deduplication (7)
datarecipes (7)
datacuration (7)
data-preparation (7)
data-prep (7)
data-preprocessing-pipelines (4)
spark (4)
data-preprocessing (4)
finetuning (4)
code-quality (4)
malware (4)
ray (4)
transforms (4)
semantic-deduplication (3)
llm-data-quality (3)
fast-data-processing (3)
data-quality (3)
data-processing-pipelines (3)
data-processing (3)
data-curation (3)
data acquisition (1)
crawler (1)
web crawler (1)
0b74b5a (1)
tokenizer (1)