An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "dataset" keyword

View the packages on the pypi.org package registry that are tagged with the "dataset" keyword.

clloader 0.0.2
A DataLoader library for Continual Learning in PyTorch.
2 versions - Latest release: about 5 years ago - 82 downloads last month - 428 stars on GitHub - 1 maintainer
waymo-open-dataset-tf-2-2-0 1.3.1
Waymo Open Dataset libraries.
3 versions - Latest release: almost 4 years ago - 2 dependent packages - 5 dependent repositories - 333 downloads last month - 2 maintainers
pyautoplot 1.0.1
PyAutoPlot is an open-source Python library designed to make dataset analysis much easier by gene...
2 versions - Latest release: 3 months ago - 123 downloads last month - 0 stars on GitHub - 1 maintainer
Top 9.6% on pypi.org
path-dict 4.0.0
Extends Python's dict with useful extras
20 versions - Latest release: about 2 years ago - 1 dependent package - 3 dependent repositories - 6.26 thousand downloads last month - 25 stars on GitHub - 1 maintainer
ecko-cli 1.6.0
CLI tool that easily converts a directory of images into a dataset for training generative AI models
8 versions - Latest release: 5 months ago - 345 downloads last month - 1 stars on GitHub - 1 maintainer
spltr 0.3.2
A simple PyTorch-based data loader and splitter
3 versions - Latest release: over 5 years ago - 1 dependent repositories - 138 downloads last month - 1 stars on GitHub - 1 maintainer
rads 0.1.0
Python front end for the Radar Altimeter Database System.
2 versions - Latest release: over 5 years ago - 1 dependent repositories - 95 downloads last month - 2 stars on GitHub - 1 maintainer
graviti 0.13.1
Graviti Python SDK
56 versions - Latest release: about 2 years ago - 1 dependent repositories - 1.61 thousand downloads last month - 12 stars on GitHub - 1 maintainer
ecodatatk 0.0.1
Developed for limnological and hydrological studies
1 version - Latest release: almost 5 years ago - 1 dependent repositories - 56 downloads last month - 0 stars on GitHub - 1 maintainer
factuality 1.0.14
Benchmarking long-form factuality in large language models. Original code for our paper "Long-for...
4 versions - Latest release: about 1 year ago - 177 downloads last month - 575 stars on GitHub - 1 maintainer
dgit_extensions 0.1.3
dgit addons
2 versions - Latest release: about 9 years ago - 2 dependent repositories - 57 downloads last month - 0 stars on GitHub - 1 maintainer
chrome-fingerprints 1.1
A collection of 10,000 self-collected Chrome fingerprints. Wrapped in an easy-to-use API, availabl...
2 versions - Latest release: over 1 year ago - 1 dependent package - 1.53 thousand downloads last month - 220 stars on GitHub - 1 maintainer
target-datadotworld 1.0.1
Singer target for data.world
5 versions - Latest release: almost 7 years ago - 1 dependent repositories - 154 downloads last month - 5 stars on GitHub - 1 maintainer
soramimi-phonetic-search-dataset 0.0.9
An evaluation dataset for search systems that take phonetic similarity into account. Contains genre-specific word pairs built from parody song lyrics.
2 versions - Latest release: about 1 month ago - 123 downloads last month - 0 stars on GitHub - 1 maintainer
mldatasetbuilder 1.0.0
MLDatasetBuilder is a Python package that helps prepare images for your ML dataset.
10 versions - Latest release: almost 5 years ago - 1 dependent repositories - 260 downloads last month - 4 stars on GitHub - 1 maintainer
biobeee 0.0.5
Bioinformatics tool for web scraping of biological databases and pre-processing
3 versions - Latest release: over 1 year ago - 71 downloads last month - 0 stars on gitlab.com - 1 maintainer
idsprites 1.0.1
Easily generate simple continual learning benchmarks.
1 version - Latest release: 9 months ago - 65 downloads last month - 495 stars on GitHub - 1 maintainer
arrowtextclassifier 1.0.3
ArrowTextClassifier is a simple text classification tool written in PyTorch that allows you to tr...
4 versions - Latest release: 12 months ago - 197 downloads last month - 1 maintainer
gps2var 0.1.0a1
Fast reading of geospatial variables by GPS coordinates
2 versions - Latest release: almost 3 years ago - 1 dependent repositories - 91 downloads last month - 7 stars on GitHub - 1 maintainer
get-random-people 1.1.3
Generates random information of a person
12 versions - Latest release: over 2 years ago - 514 downloads last month - 0 stars on GitHub - 1 maintainer
Top 4.7% on pypi.org
mirdata 0.3.9
Common loaders for MIR datasets.
47 versions - Latest release: 5 months ago - 3 dependent packages - 5 dependent repositories - 2.92 thousand downloads last month - 375 stars on GitHub - 3 maintainers
Top 2.2% on pypi.org
colour-science 0.4.6 💰
Colour Science for Python
25 versions - Latest release: 6 months ago - 23 dependent packages - 94 dependent repositories - 332 thousand downloads last month - 2,237 stars on GitHub - 1 maintainer
datapush 1.0.1
MySQL Data Generator
3 versions - Latest release: 5 months ago - 137 downloads last month - 2 stars on GitHub - 1 maintainer
sqlstring 0.1.2
sqlstring creates SQL query strings.
1 version - Latest release: almost 9 years ago - 2 dependent repositories - 24 downloads last month - 1 stars on GitHub - 1 maintainer
glossapi 0.0.10
A library for processing academic texts in Greek and other languages
7 versions - Latest release: 4 days ago - 686 downloads last month - 101 stars on GitHub - 2 maintainers
pascal-voc 2.1.12
Tool to work with annotation formats
19 versions - Latest release: 3 months ago - 1 dependent repositories - 836 downloads last month - 6 stars on GitHub - 1 maintainer
hirundo 0.1.9
This package is used to interface with Hirundo's platform. It provides a simple API to optimize y...
8 versions - Latest release: 4 months ago - 332 downloads last month - 2 stars on GitHub - 2 maintainers
coarij 0.2.8
Corpus of Annual Reports in Japan
9 versions - Latest release: over 4 years ago - 1 dependent repositories - 225 downloads last month - 90 stars on GitHub - 1 maintainer
chafic 0.1.10
chakki Financial Report Corpus
11 versions - Latest release: over 5 years ago - 288 downloads last month - 90 stars on GitHub - 1 maintainer
mtgdata 0.3.0
MTG image dataset with automatic image scraping and conversion.
9 versions - Latest release: 4 days ago - 1 dependent repositories - 520 downloads last month - 4 stars on GitHub - 1 maintainer
dldummygen 0.0.2 💰
Deep-Learning Dummy File Generator by csv File
1 version - Latest release: over 4 years ago - 1 dependent repositories - 66 downloads last month - 17,636 stars on GitHub - 1 maintainer
knyfe 0.4.2
A utility for rapid exploration and preprocessing of datasets.
1 version - Latest release: almost 13 years ago - 2 dependent repositories - 65 downloads last month - 54 stars on GitHub - 1 maintainer
mahanlp 0.0.2
An NLP Library for Marathi Language
11 versions - Latest release: over 2 years ago - 302 downloads last month - 118 stars on GitHub - 1 maintainer
randomlib 4.5
An NLP Library for Marathi Language
44 versions - Latest release: about 2 years ago - 840 downloads last month - 102 stars on GitHub - 1 maintainer
demomarlib 0.0.3
An NLP Library for Marathi Language
3 versions - Latest release: over 2 years ago - 85 downloads last month - 118 stars on GitHub - 1 maintainer
kyoushi-dataset 0.2.1
Tool for labeling log data from testbeds
3 versions - Latest release: about 3 years ago - 1 dependent repositories - 125 downloads last month - 2 stars on GitHub - 1 maintainer
hfdl 0.4.0
Fast and reliable downloader for Hugging Face models and datasets
4 versions - Latest release: about 1 month ago - 98 downloads last month - 1 stars on GitHub - 1 maintainer
doccano-multi-label 1.8.4.2 💰
doccano, text annotation tool for machine learning practitioners
2 versions - Latest release: 8 months ago - 79 downloads last month - 9,910 stars on GitHub - 1 maintainer
Top 4.0% on pypi.org
doccano 1.8.4 💰
doccano, text annotation tool for machine learning practitioners
31 versions - Latest release: over 1 year ago - 7 dependent repositories - 17.7 thousand downloads last month - 8,436 stars on GitHub - 1 maintainer
Top 6.6% on pypi.org
flwr-datasets 0.5.0
Flower Datasets
7 versions - Latest release: 4 months ago - 7 dependent repositories - 16.9 thousand downloads last month - 4,839 stars on GitHub - 2 maintainers
Top 3.5% on pypi.org
pix2tex 0.1.4
pix2tex: Using a ViT to convert images of equations into LaTeX code.
33 versions - Latest release: 3 months ago - 2 dependent packages - 5 dependent repositories - 5.98 thousand downloads last month - 13,990 stars on GitHub - 1 maintainer
dataset-encoder 1.0.0
Transform your preprocessing stage into a fully automated process!
1 version - Latest release: 11 months ago - 63 downloads last month - 0 stars on GitHub - 1 maintainer
dbrecord 1.1.4
SQLite-based key-value database for big data I/O.
10 versions - Latest release: over 2 years ago - 1 dependent package - 3 dependent repositories - 464 downloads last month - 6 stars on GitHub - 1 maintainer
pipelime-python 2.0.0
Data workflows, cli and dataflow automation.
28 versions - Latest release: about 1 month ago - 1 dependent repositories - 1.05 thousand downloads last month - 19 stars on GitHub - 1 maintainer
hasy 0.3.1 💰
Tools for the HASY dataset.
5 versions - Latest release: over 4 years ago - 1 dependent repositories - 179 downloads last month - 33 stars on GitHub - 1 maintainer
databalancer 0.2.0
Databalancer is a Python library dedicated to balancing imbalanced text classification datase...
21 versions - Latest release: almost 3 years ago - 1 dependent repositories - 125 downloads last month - 7 stars on GitHub - 1 maintainer
dataset-translator 0.1.5
⚡️ Efficient dataset translation using Google Translate's API
6 versions - Latest release: 2 months ago - 261 downloads last month - 1 stars on GitHub - 1 maintainer
agentchef 0.2.7
Comprehensive toolkit for conversation dataset creation, augmentation, and analysis
10 versions - Latest release: 5 days ago - 850 downloads last month - 1 stars on GitHub - 1 maintainer
caffe2_db_image_writer 0.1
caffe2 image writer
1 version - Latest release: almost 8 years ago - 32 downloads last month - 1 maintainer
zosedit 0.0.15
FTP-based MVS Dataset Editor
15 versions - Latest release: 6 months ago - 552 downloads last month - 0 stars on GitHub - 1 maintainer
pardata 0.4.0
A Python API that enables data consumers and distributors to easily use and share datasets, and e...
2 versions - Latest release: over 3 years ago - 2 dependent repositories - 187 downloads last month - 17 stars on GitHub - 1 maintainer
doccano-transformer 1.0.2
Format transformer tool for doccano
3 versions - Latest release: over 4 years ago - 1 dependent repositories - 99 downloads last month - 107 stars on GitHub - 1 maintainer
bioimageloader 0.1.1
load bioimages for machine learning
12 versions - Latest release: over 2 years ago - 1 dependent package - 1 dependent repositories - 368 downloads last month - 7 stars on GitHub - 1 maintainer
dsrl 0.1.0
Datasets for Offline Safe Reinforcement Learning
1 version - Latest release: almost 2 years ago - 1 dependent package - 316 downloads last month - 88 stars on GitHub - 1 maintainer
tf-datasets 0.0.1
tensorflow/datasets
1 version - Latest release: over 6 years ago - 1 dependent repositories - 73 downloads last month - 4,392 stars on GitHub - 1 maintainer
graphite-datasets 1.0.59
tensorflow/datasets is a library of datasets ready to use with TensorFlow.
60 versions - Latest release: about 1 year ago - 1.78 thousand downloads last month - 4,392 stars on GitHub - 1 maintainer
symjax 0.5.0
A Symbolic JAX software
12 versions - Latest release: over 4 years ago - 1 dependent repositories - 241 downloads last month - 119 stars on GitHub - 1 maintainer
rstojnic-tfds-nightly 4.6.0.dev202206140947
tensorflow/datasets is a library of datasets ready to use with TensorFlow.
3 versions - Latest release: almost 3 years ago - 124 downloads last month - 4,084 stars on GitHub - 1 maintainer
hscitorchutil 0.3.1
HSCI research group utilities for pytorch (lightning)
36 versions - Latest release: 3 months ago - 1.45 thousand downloads last month - 0 stars on GitHub - 1 maintainer
Top 9.0% on pypi.org
freemusicarchive 0.0.0 💰
Free Music Archive
1 version - Latest release: over 1 year ago - 1 dependent repositories - 2,077 stars on GitHub - 1 maintainer
webdata 0.0.1
Publish data on web
1 version - Latest release: about 9 years ago - 4 dependent repositories - 57 downloads last month - 1 stars on GitHub - 1 maintainer
geox 0.0.14
GeoX, Geostatic Dataset Integration Tool
14 versions - Latest release: over 2 years ago - 526 downloads last month - 3 stars on GitHub - 1 maintainer
lrbenchmark 0.1.2
Benchmarking Likelihood Ratio systems
3 versions - Latest release: almost 2 years ago - 1 dependent package - 120 downloads last month - 0 stars on GitHub - 1 maintainer
patent-parsing-tools 0.9.5
patent-parsing-tools is a library providing tools for generating training and test set from Googl...
4 versions - Latest release: 4 months ago - 2 dependent repositories - 112 downloads last month - 5 stars on GitHub - 1 maintainer
mafaextractor 0.1.1
Extract label data from the MAFA dataset into a Pandas DataFrame.
2 versions - Latest release: over 4 years ago - 1 dependent repositories - 101 downloads last month - 3 stars on GitHub - 1 maintainer
Top 9.4% on pypi.org
waymo-open-dataset-tf-2-3-0 1.3.1
Waymo Open Dataset libraries.
4 versions - Latest release: almost 4 years ago - 1 dependent package - 4 dependent repositories - 196 downloads last month - 2 maintainers
midv500 0.2.1 💰
Download and convert MIDV-500 annotations to COCO instance segmentation format
5 versions - Latest release: over 4 years ago - 1 dependent repositories - 305 downloads last month - 87 stars on GitHub - 1 maintainer
recsys-slates-dataset 1.0.3
Recommender Systems Dataset from FINN.no containing the presented items and whether and what the ...
10 versions - Latest release: over 2 years ago - 1 dependent repositories - 533 downloads last month - 52 stars on GitHub - 1 maintainer
dataset-model 0.1.2
Easy model object to work with the dataset framework
3 versions - Latest release: almost 12 years ago - 2 dependent repositories - 94 downloads last month - 6 stars on GitHub - 1 maintainer
barecat 0.2.5
Scalable archive format for storing millions of small files with random access and SQLite indexing.
1 version - Latest release: 28 days ago - 107 downloads last month - 10 stars on GitHub - 1 maintainer
detection_datasets 0.3.8
Easily load and transform datasets for object detection
14 versions - Latest release: over 1 year ago - 563 downloads last month - 8 stars on GitHub - 1 maintainer
causeinfer 1.0.2
Machine learning based causal inference/uplift in Python
35 versions - Latest release: almost 3 years ago - 1 dependent repositories - 828 downloads last month - 55 stars on GitHub - 1 maintainer
tweet-getter 1.0.0rc2
Get Tweets by IDs through Twitter's API
1 version - Latest release: over 4 years ago - 1 dependent repositories - 54 downloads last month - 6 stars on GitHub - 1 maintainer
rdetoolkit 1.2.0
A module that supports the workflow of the RDE dataset construction program
9 versions - Latest release: 5 days ago - 6.62 thousand downloads last month - 3 stars on GitHub - 1 maintainer
tinysets 0.0.3
Collection of different datasets
2 versions - Latest release: over 4 years ago - 1 dependent repositories - 89 downloads last month - 6 stars on GitHub - 1 maintainer
datasetclient 0.1.3
A simple Python package to interact with an IDARE dataset API.
3 versions - Latest release: 2 months ago - 151 downloads last month - 0 stars on gitlab.com - 1 maintainer
transformer-srl 2.5.2
SRL Transformer model
74 versions - Latest release: about 3 years ago - 1 dependent repositories - 1.57 thousand downloads last month - 70 stars on GitHub - 1 maintainer
nlp-dataset-readers 0.1.7
Dataset Readers for NLP
8 versions - Latest release: over 3 years ago - 1 dependent repositories - 299 downloads last month - 3 stars on GitHub - 1 maintainer
climafeis 1.0.10
Scrape daily climate data from Canal CLIMA
7 versions - Latest release: over 1 year ago - 299 downloads last month - 0 stars on GitHub - 1 maintainer
Top 2.6% on pypi.org
mosaicml-streaming 0.12.0
Streaming lets users create PyTorch compatible datasets that can be streamed from cloud-based obj...
35 versions - Latest release: 16 days ago - 4 dependent packages - 61 dependent repositories - 106 thousand downloads last month - 1,270 stars on GitHub - 3 maintainers
dlkp 0.0.1
A deep learning library for keyphrase extraction and generation
1 version - Latest release: about 3 years ago - 1 dependent repositories - 58 downloads last month - 25 stars on GitHub - 2 maintainers
ego4d 1.7.3
Ego4D Dataset CLI
18 versions - Latest release: 11 months ago - 1 dependent repositories - 1.18 thousand downloads last month - 410 stars on GitHub - 2 maintainers
osdata 1.0.0
A Universal SDK for discovering and analyzing research datasets
4 versions - Latest release: about 1 year ago - 197 downloads last month - 1 maintainer
pypi-learn 0.0.7
Generates a dataset for Turkish speech recognition.
4 versions - Latest release: about 4 years ago - 1 dependent repositories - 170 downloads last month - 1 maintainer
ldc-doc 0.0.4
Python3 library that adds MS Word .doc support to the llm-dataset-converter library.
4 versions - Latest release: about 1 month ago - 220 downloads last month - 0 stars on GitHub - 1 maintainer
ldc-docx 0.0.3
Python3 library that adds MS Word .docx support to the llm-dataset-converter library.
3 versions - Latest release: about 1 month ago - 201 downloads last month - 1 stars on GitHub - 1 maintainer
ldc-openai 0.0.1
Python3 library containing plugins for llm-dataset-converter specific to OpenAI.
1 version - Latest release: 12 months ago - 45 downloads last month - 0 stars on GitHub - 1 maintainer
ldc-google 0.0.1
Python3 library integrating Google services into the llm-dataset-converter library.
1 version - Latest release: 12 months ago - 46 downloads last month - 0 stars on GitHub - 1 maintainer
ldc-html 0.0.3
Python3 library that adds HTML support to the llm-dataset-converter library.
3 versions - Latest release: about 1 month ago - 190 downloads last month - 0 stars on GitHub - 1 maintainer
petdatasetreader 0.0.2
Convenient interface that provides structured representations of the PET dataset hosted on Huggin...
5 versions - Latest release: 11 months ago - 1 dependent repositories - 110 downloads last month - 1 maintainer
holcrawl 1.0.1
A crawler for building Hollywood movie datasets.
2 versions - Latest release: about 8 years ago - 1 dependent repositories - 65 downloads last month - 9 stars on GitHub - 1 maintainer
ficto 0.0.8
A Python package to generate realistic demo data in CSV or JSON format.
7 versions - Latest release: 9 months ago - 335 downloads last month - 16 stars on GitHub - 1 maintainer
skyimages 0.0.2
Downloading sky image datasets for pytorch applications
2 versions - Latest release: over 2 years ago - 56 downloads last month - 13 stars on GitHub - 1 maintainer
trustllm 0.3.0
TrustLLM
7 versions - Latest release: 12 months ago - 320 downloads last month - 544 stars on GitHub - 1 maintainer
pyeeglab 0.10.0
Analyze and manipulate EEG data using PyEEGLab
20 versions - Latest release: over 4 years ago - 1 dependent repositories - 400 downloads last month - 61 stars on GitHub - 1 maintainer
smallssd 0.0.5
Open source agricultural data
5 versions - Latest release: almost 3 years ago - 1 dependent repositories - 183 downloads last month - 5 stars on GitHub - 1 maintainer
cppe5 0.1.1 💰
A library to easily download, load and work with the CPPE-5 dataset.
2 versions - Latest release: about 2 years ago - 1 dependent repositories - 131 downloads last month - 68 stars on GitHub - 2 maintainers
quevedo 1.3.1
Tool for managing datasets of images with compositional semantics
6 versions - Latest release: about 3 years ago - 1 dependent repositories - 224 downloads last month - 3 stars on GitHub - 1 maintainer
synthex 0.1.7
Generate high-quality, large-scale synthetic datasets 📊🧪
5 versions - Latest release: 6 days ago - 322 downloads last month - 0 stars on GitHub - 1 maintainer
scieloscopus 1.0.0
Library to deliver Scopus and Scimago indicators of SciELO journals
1 version - Latest release: over 7 years ago - 1 dependent repositories - 25 downloads last month - 2 stars on GitHub - 1 maintainer
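
Each entry above ends with a statistics line whose fields are separated by " - ". The following is a minimal sketch of how such a line could be turned into a dictionary. The function names `parse_count` and `parse_stats` are hypothetical and not part of any API offered by the service, and the sketch assumes "thousand" is the only magnitude word, since it is the only one that appears in this listing.

```python
import re


def parse_count(text):
    """Convert a count such as '82', '2,237' or '6.26 thousand' to an int."""
    text = text.replace(",", "").strip()
    multiplier = 1
    # Assumption: 'thousand' is the only magnitude word used in the listing.
    if text.endswith(" thousand"):
        text = text[: -len(" thousand")]
        multiplier = 1_000
    return int(round(float(text) * multiplier))


def parse_stats(line):
    """Split one statistics line into a dict keyed by the field label."""
    stats = {}
    for field in line.split(" - "):
        field = field.strip()
        if field.startswith("Latest release:"):
            # Keep the relative date as free text, e.g. 'about 2 years ago'.
            stats["latest_release"] = field.split(":", 1)[1].strip()
            continue
        match = re.fullmatch(r"([\d.,]+(?: thousand)?) (.+)", field)
        if match:
            # Labels are kept verbatim, so singular and plural forms
            # ('version' / 'versions') end up as different keys.
            stats[match.group(2)] = parse_count(match.group(1))
    return stats


example = (
    "20 versions - Latest release: about 2 years ago - 1 dependent package"
    " - 3 dependent repositories - 6.26 thousand downloads last month"
    " - 25 stars on GitHub - 1 maintainer"
)
print(parse_stats(example))
```

Keeping the labels verbatim avoids guessing at a normalised schema. A caller who needs uniform keys could map "version" and "versions" to one name afterwards.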