pypi.org "dataset" keyword
View the packages on the pypi.org package registry that are tagged with the "dataset" keyword.
clloader 0.0.2
A DataLoader library for Continual Learning in PyTorch.
2 versions - Latest release: about 5 years ago - 82 downloads last month - 428 stars on GitHub - 1 maintainer
waymo-open-dataset-tf-2-2-0 1.3.1
Waymo Open Dataset libraries.
3 versions - Latest release: almost 4 years ago - 2 dependent packages - 5 dependent repositories - 333 downloads last month - 2 maintainers
pyautoplot 1.0.1
PyAutoPlot is an open-source Python library designed to make dataset analysis much easier by gene...
2 versions - Latest release: 3 months ago - 123 downloads last month - 0 stars on GitHub - 1 maintainer
Top 9.6% on pypi.org
path-dict 4.0.0
Extends Python's dict with useful extras
20 versions - Latest release: about 2 years ago - 1 dependent package - 3 dependent repositories - 6.26 thousand downloads last month - 25 stars on GitHub - 1 maintainer
ecko-cli 1.6.0
CLI tool that easily converts a directory of images into a dataset for training generative AI models
8 versions - Latest release: 5 months ago - 345 downloads last month - 1 star on GitHub - 1 maintainer
spltr 0.3.2
A simple PyTorch-based data loader and splitter
3 versions - Latest release: over 5 years ago - 1 dependent repository - 138 downloads last month - 1 star on GitHub - 1 maintainer
rads 0.1.0
Python front end for the Radar Altimeter Database System.
2 versions - Latest release: over 5 years ago - 1 dependent repository - 95 downloads last month - 2 stars on GitHub - 1 maintainer
graviti 0.13.1
Graviti Python SDK
56 versions - Latest release: about 2 years ago - 1 dependent repository - 1.61 thousand downloads last month - 12 stars on GitHub - 1 maintainer
ecodatatk 0.0.1
Developed for limnological and hydrological studies
1 version - Latest release: almost 5 years ago - 1 dependent repository - 56 downloads last month - 0 stars on GitHub - 1 maintainer
factuality 1.0.14
Benchmarking long-form factuality in large language models. Original code for our paper "Long-for...
4 versions - Latest release: about 1 year ago - 177 downloads last month - 575 stars on GitHub - 1 maintainer
dgit_extensions 0.1.3
dgit addons
2 versions - Latest release: about 9 years ago - 2 dependent repositories - 57 downloads last month - 0 stars on GitHub - 1 maintainer
chrome-fingerprints 1.1
A collection of 10,000 self-collected Chrome fingerprints. Wrapped in an easy-to-use API, availabl...
2 versions - Latest release: over 1 year ago - 1 dependent package - 1.53 thousand downloads last month - 220 stars on GitHub - 1 maintainer
target-datadotworld 1.0.1
Singer target for data.world
5 versions - Latest release: almost 7 years ago - 1 dependent repository - 154 downloads last month - 5 stars on GitHub - 1 maintainer
soramimi-phonetic-search-dataset 0.0.9
Evaluation dataset for search systems that account for phonetic similarity. Contains word pairs from specific genres, built from parody song lyrics.
2 versions - Latest release: about 1 month ago - 123 downloads last month - 0 stars on GitHub - 1 maintainer
mldatasetbuilder 1.0.0
MLDatasetBuilder is a Python package that helps prepare images for your ML dataset.
10 versions - Latest release: almost 5 years ago - 1 dependent repository - 260 downloads last month - 4 stars on GitHub - 1 maintainer
biobeee 0.0.5
Bioinformatics tool for performing web scraping of biological databases and pre-processing
3 versions - Latest release: over 1 year ago - 71 downloads last month - 0 stars on gitlab.com - 1 maintainer
idsprites 1.0.1
Easily generate simple continual learning benchmarks.
1 version - Latest release: 9 months ago - 65 downloads last month - 495 stars on GitHub - 1 maintainer
arrowtextclassifier 1.0.3
ArrowTextClassifier is a simple text classification tool written in PyTorch that allows you to tr...
4 versions - Latest release: 12 months ago - 197 downloads last month - 1 maintainer
gps2var 0.1.0a1
Fast reading of geospatial variables by GPS coordinates
2 versions - Latest release: almost 3 years ago - 1 dependent repository - 91 downloads last month - 7 stars on GitHub - 1 maintainer
get-random-people 1.1.3
Generates random information about a person
12 versions - Latest release: over 2 years ago - 514 downloads last month - 0 stars on GitHub - 1 maintainer
Top 4.7% on pypi.org
mirdata 0.3.9
Common loaders for MIR datasets.
47 versions - Latest release: 5 months ago - 3 dependent packages - 5 dependent repositories - 2.92 thousand downloads last month - 375 stars on GitHub - 3 maintainers
Top 2.2% on pypi.org
colour-science 0.4.6 💰
Colour Science for Python
25 versions - Latest release: 6 months ago - 23 dependent packages - 94 dependent repositories - 332 thousand downloads last month - 2,237 stars on GitHub - 1 maintainer
datapush 1.0.1
MySQL Data Generator
3 versions - Latest release: 5 months ago - 137 downloads last month - 2 stars on GitHub - 1 maintainer
sqlstring 0.1.2
sqlstring creates SQL query strings.
1 version - Latest release: almost 9 years ago - 2 dependent repositories - 24 downloads last month - 1 star on GitHub - 1 maintainer
glossapi 0.0.10
A library for processing academic texts in Greek and other languages
7 versions - Latest release: 4 days ago - 686 downloads last month - 101 stars on GitHub - 2 maintainers
pascal-voc 2.1.12
Tool to work with annotation formats
19 versions - Latest release: 3 months ago - 1 dependent repository - 836 downloads last month - 6 stars on GitHub - 1 maintainer
hirundo 0.1.9
This package is used to interface with Hirundo's platform. It provides a simple API to optimize y...
8 versions - Latest release: 4 months ago - 332 downloads last month - 2 stars on GitHub - 2 maintainers
coarij 0.2.8
Corpus of Annual Reports in Japan
9 versions - Latest release: over 4 years ago - 1 dependent repository - 225 downloads last month - 90 stars on GitHub - 1 maintainer
chafic 0.1.10
chakki Financial Report Corpus
11 versions - Latest release: over 5 years ago - 288 downloads last month - 90 stars on GitHub - 1 maintainer
mtgdata 0.3.0
MTG image dataset with automatic image scraping and conversion.
9 versions - Latest release: 4 days ago - 1 dependent repository - 520 downloads last month - 4 stars on GitHub - 1 maintainer
dldummygen 0.0.2 💰
Deep-Learning Dummy File Generator by csv File
1 version - Latest release: over 4 years ago - 1 dependent repository - 66 downloads last month - 17,636 stars on GitHub - 1 maintainer
knyfe 0.4.2
A utility for rapid exploration and preprocessing of datasets.
1 version - Latest release: almost 13 years ago - 2 dependent repositories - 65 downloads last month - 54 stars on GitHub - 1 maintainer
mahanlp 0.0.2
An NLP Library for Marathi Language
11 versions - Latest release: over 2 years ago - 302 downloads last month - 118 stars on GitHub - 1 maintainer
randomlib 4.5
An NLP Library for Marathi Language
44 versions - Latest release: about 2 years ago - 840 downloads last month - 102 stars on GitHub - 1 maintainer
demomarlib 0.0.3
An NLP Library for Marathi Language
3 versions - Latest release: over 2 years ago - 85 downloads last month - 118 stars on GitHub - 1 maintainer
kyoushi-dataset 0.2.1
Tool for labeling log data from testbeds
3 versions - Latest release: about 3 years ago - 1 dependent repository - 125 downloads last month - 2 stars on GitHub - 1 maintainer
hfdl 0.4.0
Fast and reliable downloader for Hugging Face models and datasets
4 versions - Latest release: about 1 month ago - 98 downloads last month - 1 star on GitHub - 1 maintainer
doccano-multi-label 1.8.4.2 💰
doccano, text annotation tool for machine learning practitioners
2 versions - Latest release: 8 months ago - 79 downloads last month - 9,910 stars on GitHub - 1 maintainer
Top 4.0% on pypi.org
doccano 1.8.4 💰
doccano, text annotation tool for machine learning practitioners
31 versions - Latest release: over 1 year ago - 7 dependent repositories - 17.7 thousand downloads last month - 8,436 stars on GitHub - 1 maintainer
Top 6.6% on pypi.org
flwr-datasets 0.5.0
Flower Datasets
7 versions - Latest release: 4 months ago - 7 dependent repositories - 16.9 thousand downloads last month - 4,839 stars on GitHub - 2 maintainers
Top 3.5% on pypi.org
pix2tex 0.1.4
pix2tex: Using a ViT to convert images of equations into LaTeX code.
33 versions - Latest release: 3 months ago - 2 dependent packages - 5 dependent repositories - 5.98 thousand downloads last month - 13,990 stars on GitHub - 1 maintainer
dataset-encoder 1.0.0
Transform your preprocessing stage into a fully automated process!
1 version - Latest release: 11 months ago - 63 downloads last month - 0 stars on GitHub - 1 maintainer
dbrecord 1.1.4
SQLite-based key-value database for big data I/O.
10 versions - Latest release: over 2 years ago - 1 dependent package - 3 dependent repositories - 464 downloads last month - 6 stars on GitHub - 1 maintainer
pipelime-python 2.0.0
Data workflows, CLI and dataflow automation.
28 versions - Latest release: about 1 month ago - 1 dependent repository - 1.05 thousand downloads last month - 19 stars on GitHub - 1 maintainer
hasy 0.3.1 💰
Tools for the HASY dataset.
5 versions - Latest release: over 4 years ago - 1 dependent repository - 179 downloads last month - 33 stars on GitHub - 1 maintainer
databalancer 0.2.0
Databalancer is a Python library dedicated to balancing imbalanced text classification datase...
21 versions - Latest release: almost 3 years ago - 1 dependent repository - 125 downloads last month - 7 stars on GitHub - 1 maintainer
dataset-translator 0.1.5
⚡️ Efficient dataset translation using Google Translate's API
6 versions - Latest release: 2 months ago - 261 downloads last month - 1 star on GitHub - 1 maintainer
agentchef 0.2.7
Comprehensive toolkit for conversation dataset creation, augmentation, and analysis
10 versions - Latest release: 5 days ago - 850 downloads last month - 1 star on GitHub - 1 maintainer
caffe2_db_image_writer 0.1
caffe2 image writer
1 version - Latest release: almost 8 years ago - 32 downloads last month - 1 maintainer
zosedit 0.0.15
FTP-based MVS Dataset Editor
15 versions - Latest release: 6 months ago - 552 downloads last month - 0 stars on GitHub - 1 maintainer
pardata 0.4.0
A Python API that enables data consumers and distributors to easily use and share datasets, and e...
2 versions - Latest release: over 3 years ago - 2 dependent repositories - 187 downloads last month - 17 stars on GitHub - 1 maintainer
doccano-transformer 1.0.2
Format transformer tool for doccano
3 versions - Latest release: over 4 years ago - 1 dependent repository - 99 downloads last month - 107 stars on GitHub - 1 maintainer
bioimageloader 0.1.1
Load bioimages for machine learning
12 versions - Latest release: over 2 years ago - 1 dependent package - 1 dependent repository - 368 downloads last month - 7 stars on GitHub - 1 maintainer
dsrl 0.1.0
Datasets for Offline Safe Reinforcement Learning
1 version - Latest release: almost 2 years ago - 1 dependent package - 316 downloads last month - 88 stars on GitHub - 1 maintainer
tf-datasets 0.0.1
tensorflow/datasets
1 version - Latest release: over 6 years ago - 1 dependent repository - 73 downloads last month - 4,392 stars on GitHub - 1 maintainer
graphite-datasets 1.0.59
tensorflow/datasets is a library of datasets ready to use with TensorFlow.
60 versions - Latest release: about 1 year ago - 1.78 thousand downloads last month - 4,392 stars on GitHub - 1 maintainer
symjax 0.5.0
A Symbolic JAX software
12 versions - Latest release: over 4 years ago - 1 dependent repository - 241 downloads last month - 119 stars on GitHub - 1 maintainer
rstojnic-tfds-nightly 4.6.0.dev202206140947
tensorflow/datasets is a library of datasets ready to use with TensorFlow.
3 versions - Latest release: almost 3 years ago - 124 downloads last month - 4,084 stars on GitHub - 1 maintainer
hscitorchutil 0.3.1
HSCI research group utilities for pytorch (lightning)
36 versions - Latest release: 3 months ago - 1.45 thousand downloads last month - 0 stars on GitHub - 1 maintainer
Top 9.0% on pypi.org
freemusicarchive 0.0.0 💰
Free Music Archive
1 version - Latest release: over 1 year ago - 1 dependent repository - 2,077 stars on GitHub - 1 maintainer
webdata 0.0.1
Publish data on web
1 version - Latest release: about 9 years ago - 4 dependent repositories - 57 downloads last month - 1 star on GitHub - 1 maintainer
geox 0.0.14
GeoX, Geostatic Dataset Integration Tool
14 versions - Latest release: over 2 years ago - 526 downloads last month - 3 stars on GitHub - 1 maintainer
lrbenchmark 0.1.2
Benchmarking Likelihood Ratio systems
3 versions - Latest release: almost 2 years ago - 1 dependent package - 120 downloads last month - 0 stars on GitHub - 1 maintainer
patent-parsing-tools 0.9.5
patent-parsing-tools is a library providing tools for generating training and test set from Googl...
4 versions - Latest release: 4 months ago - 2 dependent repositories - 112 downloads last month - 5 stars on GitHub - 1 maintainer
mafaextractor 0.1.1
Extract label data from the MAFA dataset into a Pandas DataFrame.
2 versions - Latest release: over 4 years ago - 1 dependent repository - 101 downloads last month - 3 stars on GitHub - 1 maintainer
Top 9.4% on pypi.org
waymo-open-dataset-tf-2-3-0 1.3.1
Waymo Open Dataset libraries.
4 versions - Latest release: almost 4 years ago - 1 dependent package - 4 dependent repositories - 196 downloads last month - 2 maintainers
midv500 0.2.1 💰
Download and convert MIDV-500 annotations to COCO instance segmentation format
5 versions - Latest release: over 4 years ago - 1 dependent repository - 305 downloads last month - 87 stars on GitHub - 1 maintainer
recsys-slates-dataset 1.0.3
Recommender Systems Dataset from FINN.no containing the presented items and whether and what the ...
10 versions - Latest release: over 2 years ago - 1 dependent repository - 533 downloads last month - 52 stars on GitHub - 1 maintainer
dataset-model 0.1.2
Easy model object to work with the dataset framework
3 versions - Latest release: almost 12 years ago - 2 dependent repositories - 94 downloads last month - 6 stars on GitHub - 1 maintainer
barecat 0.2.5
Scalable archive format for storing millions of small files with random access and SQLite indexing.
1 version - Latest release: 28 days ago - 107 downloads last month - 10 stars on GitHub - 1 maintainer
detection_datasets 0.3.8
Easily load and transform datasets for object detection
14 versions - Latest release: over 1 year ago - 563 downloads last month - 8 stars on GitHub - 1 maintainer
causeinfer 1.0.2
Machine learning based causal inference/uplift in Python
35 versions - Latest release: almost 3 years ago - 1 dependent repository - 828 downloads last month - 55 stars on GitHub - 1 maintainer
tweet-getter 1.0.0rc2
Get Tweets by IDs through Twitter's API
1 version - Latest release: over 4 years ago - 1 dependent repository - 54 downloads last month - 6 stars on GitHub - 1 maintainer
rdetoolkit 1.2.0
A module that supports the workflow of the RDE dataset construction program
9 versions - Latest release: 5 days ago - 6.62 thousand downloads last month - 3 stars on GitHub - 1 maintainer
tinysets 0.0.3
Collection of different datasets
2 versions - Latest release: over 4 years ago - 1 dependent repository - 89 downloads last month - 6 stars on GitHub - 1 maintainer
datasetclient 0.1.3
A simple Python package to interact with an IDARE dataset API.
3 versions - Latest release: 2 months ago - 151 downloads last month - 0 stars on gitlab.com - 1 maintainer
transformer-srl 2.5.2
SRL Transformer model
74 versions - Latest release: about 3 years ago - 1 dependent repository - 1.57 thousand downloads last month - 70 stars on GitHub - 1 maintainer
nlp-dataset-readers 0.1.7
Dataset Readers for NLP
8 versions - Latest release: over 3 years ago - 1 dependent repository - 299 downloads last month - 3 stars on GitHub - 1 maintainer
climafeis 1.0.10
Scrape daily climate data from Canal CLIMA
7 versions - Latest release: over 1 year ago - 299 downloads last month - 0 stars on GitHub - 1 maintainer
Top 2.6% on pypi.org
mosaicml-streaming 0.12.0
Streaming lets users create PyTorch compatible datasets that can be streamed from cloud-based obj...
35 versions - Latest release: 16 days ago - 4 dependent packages - 61 dependent repositories - 106 thousand downloads last month - 1,270 stars on GitHub - 3 maintainers
dlkp 0.0.1
A deep learning library for keyphrase extraction and generation
1 version - Latest release: about 3 years ago - 1 dependent repository - 58 downloads last month - 25 stars on GitHub - 2 maintainers
ego4d 1.7.3
Ego4D Dataset CLI
18 versions - Latest release: 11 months ago - 1 dependent repository - 1.18 thousand downloads last month - 410 stars on GitHub - 2 maintainers
osdata 1.0.0
A Universal SDK for discovering and analyzing research datasets
4 versions - Latest release: about 1 year ago - 197 downloads last month - 1 maintainer
pypi-learn 0.0.7
Generates a dataset for Turkish speech recognition.
4 versions - Latest release: about 4 years ago - 1 dependent repository - 170 downloads last month - 1 maintainer
ldc-doc 0.0.4
Python3 library that adds MS Word .doc support to the llm-dataset-converter library.
4 versions - Latest release: about 1 month ago - 220 downloads last month - 0 stars on GitHub - 1 maintainer
ldc-docx 0.0.3
Python3 library that adds MS Word .docx support to the llm-dataset-converter library.
3 versions - Latest release: about 1 month ago - 201 downloads last month - 1 star on GitHub - 1 maintainer
ldc-openai 0.0.1
Python3 library containing plugins for llm-dataset-converter specific to OpenAI.
1 version - Latest release: 12 months ago - 45 downloads last month - 0 stars on GitHub - 1 maintainer
ldc-google 0.0.1
Python3 library integrating Google services into the llm-dataset-converter library.
1 version - Latest release: 12 months ago - 46 downloads last month - 0 stars on GitHub - 1 maintainer
ldc-html 0.0.3
Python3 library that adds HTML support to the llm-dataset-converter library.
3 versions - Latest release: about 1 month ago - 190 downloads last month - 0 stars on GitHub - 1 maintainer
petdatasetreader 0.0.2
Convenient interface that provides structured representations of the PET dataset hosted on Huggin...
5 versions - Latest release: 11 months ago - 1 dependent repository - 110 downloads last month - 1 maintainer
holcrawl 1.0.1
A crawler for building Hollywood movie datasets.
2 versions - Latest release: about 8 years ago - 1 dependent repository - 65 downloads last month - 9 stars on GitHub - 1 maintainer
ficto 0.0.8
A Python package to generate realistic demo data in csv or json format.
7 versions - Latest release: 9 months ago - 335 downloads last month - 16 stars on GitHub - 1 maintainer
skyimages 0.0.2
Downloading sky image datasets for pytorch applications
2 versions - Latest release: over 2 years ago - 56 downloads last month - 13 stars on GitHub - 1 maintainer
trustllm 0.3.0
TrustLLM
7 versions - Latest release: 12 months ago - 320 downloads last month - 544 stars on GitHub - 1 maintainer
pyeeglab 0.10.0
Analyze and manipulate EEG data using PyEEGLab
20 versions - Latest release: over 4 years ago - 1 dependent repository - 400 downloads last month - 61 stars on GitHub - 1 maintainer
smallssd 0.0.5
Open source agricultural data
5 versions - Latest release: almost 3 years ago - 1 dependent repository - 183 downloads last month - 5 stars on GitHub - 1 maintainer
cppe5 0.1.1 💰
A library to easily download, load and work with the CPPE-5 dataset.
2 versions - Latest release: about 2 years ago - 1 dependent repository - 131 downloads last month - 68 stars on GitHub - 2 maintainers
quevedo 1.3.1
Tool for managing datasets of images with compositional semantics
6 versions - Latest release: about 3 years ago - 1 dependent repository - 224 downloads last month - 3 stars on GitHub - 1 maintainer
synthex 0.1.7
Generate high-quality, large-scale synthetic datasets 📊🧪
5 versions - Latest release: 6 days ago - 322 downloads last month - 0 stars on GitHub - 1 maintainer
scieloscopus 1.0.0
Library to deliver Scopus and SCImago indicators of SciELO Journals
1 version - Latest release: over 7 years ago - 1 dependent repository - 25 downloads last month - 2 stars on GitHub - 1 maintainer
Related Keywords
python (183)
machine-learning (117)
deep-learning (97)
data (96)
datasets (72)
pytorch (69)
learning (50)
data-science (46)
tensorflow (44)
nlp (42)
machine (40)
computer-vision (36)
python3 (36)
machine learning (35)
natural-language-processing (31)
ai (28)
csv (24)
llm (22)
pandas (22)
object-detection (21)
image (20)
benchmark (20)
database (20)
annotation (18)
cli (17)
huggingface (17)
dataset-generation (17)
data-analysis (16)
classification (16)
download (15)
json (15)
images (14)
models (14)
ml (13)
segmentation (13)
test (12)
api (12)
image-classification (12)
NLP (12)
autonomous (12)
driving (12)
image-processing (12)
yolo (12)
deep (12)
numpy (11)
dataloader (11)
metadata (11)
library (11)
text-classification (11)
mlops (11)
pypi (11)
datascience (11)
training (11)
package (11)
deep learning (10)
audio (10)
visualization (10)
coco (10)
preprocessing (10)
data-mining (10)
annotation-tool (10)
generator (10)
labeling-tool (10)
downloader (9)
corpus (9)
labeling (9)
annotations (9)
torch (9)
neural-networks (9)
science (9)
bert (9)
streaming (9)
testing (8)
natural language processing (8)
data-labeling (8)
hacktoberfest (8)
jax (8)
clustering (8)
dataset-manager (8)
scraper (8)
semantic-segmentation (8)
computer vision (8)
bioinformatics (8)
federated-learning (8)
python-package (8)
text (8)
ocr (8)
analytics (8)
statistics (7)
fake (7)
medical (7)
transformer (7)
scraping (7)
graph (7)
evaluation (7)
sentiment-analysis (7)
text-annotation (7)
large-language-models (7)
PyTorch (7)
dataframe (7)