Ecosyste.ms: Packages

An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "dataset" keyword

Top 1.0% on pypi.org
tfds-nightly 4.9.4.dev202401070044
tensorflow/datasets is a library of datasets ready to use with TensorFlow.
1,859 versions - Latest release: 4 months ago - 13 dependent packages - 296 dependent repositories - 1.35 million downloads last month - 4,085 stars on GitHub - 8 maintainers
Top 5.0% on pypi.org
tensorflow-io-nightly 0.31.0.dev20230309180344
TensorFlow IO
902 versions - Latest release: about 1 year ago - 3 dependent repositories - 29.9 thousand downloads last month - 690 stars on GitHub - 5 maintainers
Top 9.4% on pypi.org
torchdatasets-nightly 1711929801
PyTorch based library focused on data processing and input pipelines in general.
850 versions - Latest release: about 2 months ago - 1 dependent repository - 5.53 thousand downloads last month - 328 stars on GitHub - 1 maintainer
Top 0.3% on pypi.org
faker 25.2.0 💰
Faker is a Python package that generates fake data for you.
361 versions - Latest release: 6 days ago - 382 dependent packages - 15,807 dependent repositories - 15.4 million downloads last month - 16,716 stars on GitHub - 2 maintainers
Top 8.6% on pypi.org
fastdup 1.123
Fast tool for gaining insights from large image repositories.
332 versions - Latest release: 13 days ago - 1 dependent repository - 3.97 thousand downloads last month - 1,410 stars on GitHub - 4 maintainers
Top 1.2% on pypi.org
whylogs 1.4.0
Profile and monitor your ML data pipeline end-to-end
312 versions - Latest release: 5 days ago - 6 dependent packages - 413 dependent repositories - 441 thousand downloads last month - 2,482 stars on GitHub - 4 maintainers
ocf-datapipes 3.3.24 💰
Pytorch Datapipes built for use in Open Climate Fix's forecasting work
244 versions - Latest release: 9 days ago - 3 dependent packages - 2.85 thousand downloads last month - 10 stars on GitHub - 2 maintainers
Top 1.6% on pypi.org
label-studio 1.12.0
Label Studio annotation tool
184 versions - Latest release: about 1 month ago - 1 dependent package - 39 dependent repositories - 42.6 thousand downloads last month - 15,269 stars on GitHub - 1 maintainer
Top 7.7% on pypi.org
starwhale 0.6.13
An MLOps Platform for Model Evaluation
181 versions - Latest release: 4 months ago - 6 dependent repositories - 428 downloads last month - 187 stars on GitHub - 1 maintainer
zengin-code 1.1.0.20240415 💰
Bank codes and branch codes for Japanese banks.
161 versions - Latest release: about 1 month ago - 1 dependent repository - 5.34 thousand downloads last month - 13 stars on GitHub - 2 maintainers
Top 4.9% on pypi.org
tensorflow-io-gcs-filesystem-nightly 0.31.0.dev20230309180344
TensorFlow IO
160 versions - Latest release: about 1 year ago - 1 dependent package - 3 dependent repositories - 9.33 thousand downloads last month - 690 stars on GitHub - 3 maintainers
Top 8.5% on pypi.org
segments-ai 1.8.1
Segments.ai Python SDK
131 versions - Latest release: 5 days ago - 1 dependent package - 3 dependent repositories - 4.21 thousand downloads last month - 20 stars on GitHub - 1 maintainer
opedia 0.2.82
Opedia is an open source database service to integrate, visualize, and analyze ocean datasets suc...
129 versions - Latest release: almost 5 years ago - 3 dependent repositories - 476 downloads last month - 10 stars on GitHub - 1 maintainer
open-geodata 22.6.59
Spatial Data of Brazil
122 versions - Latest release: about 1 year ago - 2 dependent packages - 3 dependent repositories - 281 downloads last month - 2 stars on GitHub - 1 maintainer
Top 2.2% on pypi.org
pytest-cases 3.8.5
Separate test code from test cases in pytest.
119 versions - Latest release: about 1 month ago - 61 dependent packages - 227 dependent repositories - 318 thousand downloads last month - 326 stars on GitHub - 1 maintainer
classixclustering 1.2.5
Fast and explainable clustering based on sorting
117 versions - Latest release: about 1 month ago - 1 dependent repository - 464 downloads last month - 81 stars on GitHub - 1 maintainer
Top 2.1% on pypi.org
datalad 1.0.2
data distribution geared toward scientific datasets
114 versions - Latest release: 29 days ago - 43 dependent packages - 78 dependent repositories - 17.9 thousand downloads last month - 489 stars on GitHub - 5 maintainers
Top 3.0% on pypi.org
img2dataset 1.45.0
Easily turn a set of image urls to an image dataset
87 versions - Latest release: 4 months ago - 2 dependent packages - 10 dependent repositories - 39.1 thousand downloads last month - 3,256 stars on GitHub - 1 maintainer
Top 5.0% on pypi.org
clip-retrieval 2.44.0
Easily computing clip embeddings and building a clip retrieval system with them
86 versions - Latest release: 4 months ago - 3 dependent repositories - 5.68 thousand downloads last month - 2,163 stars on GitHub - 1 maintainer
datasetrising 1.0.4
Toolchain for creating and training Stable Diffusion models with custom datasets
86 versions - Latest release: 5 months ago - 802 downloads last month - 11 stars on GitHub - 1 maintainer
Top 9.0% on pypi.org
opendatalab 0.0.10
OpenDataLab Python SDK
83 versions - Latest release: 10 months ago - 1 dependent package - 1 dependent repository - 188 thousand downloads last month - 52 stars on GitHub - 2 maintainers
Top 9.6% on pypi.org
tensorbay 1.24.2
Graviti TensorBay Python SDK
74 versions - Latest release: over 1 year ago - 1 dependent package - 2 dependent repositories - 145 downloads last month - 74 stars on GitHub - 1 maintainer
transformer-srl 2.5.2
SRL Transformer model
74 versions - Latest release: about 2 years ago - 1 dependent repository - 394 downloads last month - 67 stars on GitHub - 1 maintainer
fastdatasets 0.9.17
fastdatasets: datasets for tfrecords
71 versions - Latest release: 7 months ago - 2 dependent packages - 1 dependent repository - 385 downloads last month - 1 star on GitHub - 1 maintainer
nlprep 0.2.1
Download and pre-processing data for nlp tasks
70 versions - Latest release: almost 3 years ago - 1 dependent repository - 635 downloads last month - 28 stars on GitHub - 1 maintainer
datamaestro-text 2024.3.10
Datamaestro module for text-related datasets
65 versions - Latest release: 2 months ago - 1 dependent package - 1 dependent repository - 610 downloads last month - 3 stars on GitHub - 1 maintainer
starwhale-bootstrap 0.2.2b6
MLOps Platform
65 versions - Latest release: almost 2 years ago - 1 dependent repository - 672 downloads last month - 187 stars on GitHub - 1 maintainer
Top 5.0% on pypi.org
convokit 3.0.0
ConvoKit
64 versions - Latest release: 10 months ago - 15 dependent repositories - 2.93 thousand downloads last month - 512 stars on GitHub - 2 maintainers
datamaestro 1.1.0
Dataset management command line and API
64 versions - Latest release: 2 months ago - 2 dependent packages - 5 dependent repositories - 550 downloads last month - 12 stars on GitHub - 1 maintainer
smashed 0.21.5
SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields ext...
64 versions - Latest release: 8 months ago - 2 dependent packages - 1 dependent repository - 11.5 thousand downloads last month - 30 stars on GitHub - 2 maintainers
Top 7.7% on pypi.org
cihai 0.33.0
Library for CJK (Chinese, Japanese, Korean) language data.
62 versions - Latest release: about 1 month ago - 1 dependent package - 7 dependent repositories - 593 downloads last month - 76 stars on GitHub - 1 maintainer
Top 1.6% on pypi.org
quandl 3.7.0
Package for quandl API access
62 versions - Latest release: over 2 years ago - 11 dependent packages - 303 dependent repositories - 128 thousand downloads last month - 1,357 stars on GitHub - 6 maintainers
robotathome 1.1.9
This package provides a Python Toolbox with a set of functions to assist in the management of Rob...
60 versions - Latest release: about 1 year ago - 1 dependent repository - 455 downloads last month - 9 stars on GitHub - 1 maintainer
graphite-datasets 1.0.59
tensorflow/datasets is a library of datasets ready to use with TensorFlow.
60 versions - Latest release: 3 months ago - 391 downloads last month - 4,157 stars on GitHub - 1 maintainer
Top 5.8% on pypi.org
pylabel 0.1.55 💰
Transform, analyze, and visualize computer vision annotations.
57 versions - Latest release: 6 months ago - 1 dependent package - 3 dependent repositories - 2.86 thousand downloads last month - 297 stars on GitHub - 1 maintainer
Top 3.1% on pypi.org
datumaro 1.6.1
Dataset Management Framework (Datumaro)
57 versions - Latest release: 16 days ago - 7 dependent packages - 30 dependent repositories - 11.9 thousand downloads last month - 481 stars on GitHub - 3 maintainers
Top 6.2% on pypi.org
randfacts 0.21.0 💰
Package to generate random facts
56 versions - Latest release: 6 months ago - 5 dependent packages - 24 dependent repositories - 10.3 thousand downloads last month - 18 stars on GitHub - 1 maintainer
graviti 0.13.1
Graviti Python SDK
56 versions - Latest release: about 1 year ago - 1 dependent repository - 314 downloads last month - 12 stars on GitHub - 1 maintainer
bugzoo 2.1.33
A platform for safely studying historical software versions.
55 versions - Latest release: over 4 years ago - 5 dependent repositories - 229 downloads last month - 64 stars on GitHub - 1 maintainer
Top 7.1% on pypi.org
unihan-etl 0.34.0
Export UNIHAN data of Chinese, Japanese, Korean to CSV, JSON or YAML
55 versions - Latest release: about 2 months ago - 3 dependent packages - 7 dependent repositories - 743 downloads last month - 51 stars on GitHub - 1 maintainer
Top 9.2% on pypi.org
datalabs 0.4.15
Datalabs
54 versions - Latest release: over 1 year ago - 3 dependent packages - 2 dependent repositories - 385 downloads last month - 124 stars on GitHub - 2 maintainers
tasknet 1.53.0
Seamless integration of tasks with huggingface models
54 versions - Latest release: 3 months ago - 346 downloads last month - 30 stars on GitHub - 1 maintainer
Top 2.9% on pypi.org
dataprofiler 0.10.9
What is in your data? Detect schema, statistics and entities in almost any file.
53 versions - Latest release: 2 months ago - 2 dependent packages - 28 dependent repositories - 18.8 thousand downloads last month - 1,363 stars on GitHub - 2 maintainers
Top 5.4% on pypi.org
cpi 1.1.5 💰
Quickly adjust U.S. dollars for inflation using the Consumer Price Index (CPI)
52 versions - Latest release: 4 days ago - 1 dependent package - 21 dependent repositories - 21.5 thousand downloads last month - 127 stars on GitHub - 1 maintainer
geode-ml 2.7.2
Classes and methods to help with the creation of geospatial training datasets and deep-learning m...
52 versions - Latest release: 10 months ago - 216 downloads last month - 0 stars on GitHub - 1 maintainer
cihai-cli 0.28.0
Command line frontend for the cihai CJK language library
51 versions - Latest release: about 1 month ago - 1 dependent package - 1 dependent repository - 488 downloads last month - 6 stars on GitHub - 1 maintainer
Top 6.3% on pypi.org
continuum 1.2.7
A clean and simple library for Continual Learning in PyTorch.
51 versions - Latest release: over 1 year ago - 11 dependent repositories - 693 downloads last month - 400 stars on GitHub - 2 maintainers
edzipdataset 0.1.49
EDZip-based, S3-hosted pytorch dataset
50 versions - Latest release: 6 months ago - 109 downloads last month - 0 stars on GitHub - 1 maintainer
wildlife-datasets 1.0.3
Library for easier access and research of wildlife re-identification datasets
49 versions - Latest release: 10 days ago - 1 dependent package - 523 downloads last month - 42 stars on GitHub - 1 maintainer
Top 4.7% on pypi.org
mirdata 0.3.8
Common loaders for MIR datasets.
46 versions - Latest release: 7 months ago - 3 dependent packages - 5 dependent repositories - 1.67 thousand downloads last month - 344 stars on GitHub - 3 maintainers
proteinflow 2.8.0
Versatile pipeline for processing protein structure data for deep learning applications.
45 versions - Latest release: 3 months ago - 259 downloads last month - 165 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
tensorflow-io 0.37.0
TensorFlow IO
44 versions - Latest release: 18 days ago - 19 dependent packages - 293 dependent repositories - 3.89 million downloads last month - 690 stars on GitHub - 6 maintainers
randomlib 4.5
An NLP Library for Marathi Language
44 versions - Latest release: about 1 year ago - 117 downloads last month - 87 stars on GitHub - 1 maintainer
Top 3.5% on pypi.org
torchxrayvision 1.2.3 💰
TorchXRayVision: A library of chest X-ray datasets and models
43 versions - Latest release: 5 days ago - 1 dependent package - 31 dependent repositories - 13.6 thousand downloads last month - 775 stars on GitHub - 2 maintainers
dytb 0.7.4 💰
Simplify the training and tuning of Tensorflow models
42 versions - Latest release: about 6 years ago - 1 dependent repository - 179 downloads last month - 214 stars on GitHub - 1 maintainer
Top 6.6% on pypi.org
scipp 24.2.0
Multi-dimensional data arrays with labeled dimensions
42 versions - Latest release: 3 months ago - 12 dependent packages - 3 dependent repositories - 9.94 thousand downloads last month - 106 stars on GitHub - 3 maintainers
dictabase 4.0.16
A database interface that mimics a python dictionary.
41 versions - Latest release: almost 4 years ago - 1 dependent package - 2 dependent repositories - 10 downloads last month - 5 stars on GitHub - 1 maintainer
ddf-utils 1.0.14
Commonly used functions/utilities for DDF file model.
39 versions - Latest release: over 2 years ago - 1 dependent repository - 321 downloads last month - 2 stars on GitHub - 1 maintainer
fdi 1.34.2
Flexible Data Integrator
39 versions - Latest release: 11 months ago - 1 dependent repository - 422 downloads last month - 1 maintainer
Top 8.9% on pypi.org
audb 1.7.2
Load and publish databases in audformat
38 versions - Latest release: 3 days ago - 1 dependent package - 4 dependent repositories - 2.39 thousand downloads last month - 20 stars on GitHub - 1 maintainer
stringzilla 3.8.3
SIMD-accelerated string search, sort, hashes, fingerprints, & edit distances
37 versions - Latest release: 22 days ago - 1 dependent package - 23.8 thousand downloads last month - 1,749 stars on GitHub - 1 maintainer
mumin 1.10.0
Seamlessly build the MuMiN dataset.
36 versions - Latest release: almost 2 years ago - 1 dependent repository - 205 downloads last month - 27 stars on GitHub - 1 maintainer
mlstructfp 0.5.7
Machine learning structural floor plan dataset
36 versions - Latest release: about 2 months ago - 250 downloads last month - 8 stars on GitHub - 1 maintainer
lokii 1.1.6
Generate, Load, Develop and Test with consistent relational datasets!
35 versions - Latest release: 11 months ago - 1 dependent repository - 85 downloads last month - 1 star on GitHub - 1 maintainer
causeinfer 1.0.2
Machine learning based causal inference/uplift in Python
35 versions - Latest release: almost 2 years ago - 1 dependent repository - 131 downloads last month - 55 stars on GitHub - 1 maintainer
Top 1.8% on pypi.org
cvat-sdk 2.13.0
CVAT REST API
35 versions - Latest release: 10 days ago - 4 dependent packages - 54 dependent repositories - 13.9 thousand downloads last month - 11,417 stars on GitHub - 3 maintainers
connectome 0.10.0
A library for datasets containing heterogeneous data
34 versions - Latest release: about 1 month ago - 1 dependent repository - 246 downloads last month - 12 stars on GitHub - 2 maintainers
Top 6.8% on pypi.org
cvat-cli 2.13.0
Command-line client for CVAT
34 versions - Latest release: 10 days ago - 1 dependent repository - 1.59 thousand downloads last month - 11,417 stars on GitHub - 3 maintainers
delia 1.2.7
DICOM Extraction for Large-scale Image Analysis (DELIA).
33 versions - Latest release: 5 months ago - 1 dependent package - 264 downloads last month - 12 stars on GitHub - 1 maintainer
Top 0.6% on pypi.org
tensorflow-datasets 4.9.4
tensorflow/datasets is a library of datasets ready to use with TensorFlow.
33 versions - Latest release: 5 months ago - 116 dependent packages - 3,946 dependent repositories - 4.14 million downloads last month - 4,085 stars on GitHub - 8 maintainers
Top 0.7% on pypi.org
torchtext 0.18.0
Text utilities, models, transforms, and datasets for PyTorch.
33 versions - Latest release: 25 days ago - 92 dependent packages - 2,976 dependent repositories - 705 thousand downloads last month - 3,450 stars on GitHub - 4 maintainers
imagepreprocessing 1.7.2
image preprocessing
32 versions - Latest release: over 3 years ago - 1 dependent repository - 138 downloads last month - 0 stars on GitHub - 1 maintainer
Top 9.6% on pypi.org
crowsetta 5.0.2
A Python tool to work with any format for annotating animal vocalizations and bioacoustics data
32 versions - Latest release: 4 months ago - 4 dependent packages - 5 dependent repositories - 615 downloads last month - 48 stars on GitHub - 1 maintainer
pytorch-datastream 0.4.10
Simple dataset to dataloader library for pytorch
31 versions - Latest release: 6 months ago - 1 dependent package - 6 dependent repositories - 349 downloads last month - 31 stars on GitHub - 1 maintainer
Top 4.0% on pypi.org
doccano 1.8.4 💰
doccano, text annotation tool for machine learning practitioners
31 versions - Latest release: 10 months ago - 7 dependent repositories - 10.3 thousand downloads last month - 8,436 stars on GitHub - 1 maintainer
Top 3.5% on pypi.org
pix2tex 0.1.2
pix2tex: Using a ViT to convert images of equations into LaTeX code.
31 versions - Latest release: 12 months ago - 2 dependent packages - 5 dependent repositories - 4.58 thousand downloads last month - 10,900 stars on GitHub - 1 maintainer
sgs 2.1.1
Python wrapper for the web service of SGS - Sistema Gerenciador de Series Temporais (Time Series Management System) of the Banco Centra...
30 versions - Latest release: over 2 years ago - 1 dependent repository - 2.67 thousand downloads last month - 72 stars on GitHub - 1 maintainer
newssentiment 1.2.28 💰
Easy-to-use, high-quality target-dependent sentiment classification for English news articles
30 versions - Latest release: 5 months ago - 1 dependent repository - 533 downloads last month - 132 stars on GitHub - 1 maintainer
Top 2.5% on pypi.org
beir 2.0.0
A Heterogeneous Benchmark for Information Retrieval
29 versions - Latest release: 10 months ago - 9 dependent packages - 30 dependent repositories - 168 thousand downloads last month - 1,370 stars on GitHub - 1 maintainer
pietoolbelt 0.3.26
Toolbelt for PiePline training pipeline
29 versions - Latest release: over 2 years ago - 2 dependent repositories - 205 downloads last month - 1 star on GitHub - 1 maintainer
Top 10.0% on pypi.org
tensorflow-io-plugin-gs-nightly 0.18.0.dev20210513213318
TensorFlow IO
29 versions - Latest release: about 3 years ago - 1 dependent package - 1 dependent repository - 1.23 thousand downloads last month - 690 stars on GitHub - 4 maintainers
tsp 1.7.3
Making permafrost data effortless
29 versions - Latest release: 8 months ago - 12 dependent repositories - 395 downloads last month - 5 stars on GitLab.com - 1 maintainer
cocorepr 0.1.0
COCO Dataset cleaning tool
29 versions - Latest release: about 3 years ago - 62 downloads last month - 2 stars on GitHub - 1 maintainer
lakeapi 0.14.0
API for accessing Lake crypto market data
28 versions - Latest release: 17 days ago - 1 dependent repository - 2.97 thousand downloads last month - 19 stars on GitHub - 1 maintainer
Top 2.6% on pypi.org
mosaicml-streaming 0.7.6
Streaming lets users create PyTorch compatible datasets that can be streamed from cloud-based obj...
27 versions - Latest release: 8 days ago - 4 dependent packages - 61 dependent repositories - 61.5 thousand downloads last month - 703 stars on GitHub - 5 maintainers
Top 3.8% on pypi.org
cryptocmd 0.6.4 💰
Cryptocurrency historical market price data scraper.
27 versions - Latest release: 7 months ago - 3 dependent packages - 21 dependent repositories - 1.74 thousand downloads last month - 523 stars on GitHub - 1 maintainer
pipelime-python 1.9.1
Data workflows, cli and dataflow automation.
27 versions - Latest release: 5 months ago - 1 dependent repository - 321 downloads last month - 18 stars on GitHub - 1 maintainer
mddatasetbuilder 1.3.8
A script to generate molecular dynamics (MD) datasets for machine learning from given LAMMPS traj...
27 versions - Latest release: 8 months ago - 1 dependent repository - 282 downloads last month - 36 stars on GitHub - 1 maintainer
Top 4.5% on pypi.org
sapien 2.2.2
SAPIEN: A SimulAted Parted based Interactive ENvironment
26 versions - Latest release: 10 months ago - 4 dependent packages - 8 dependent repositories - 2.48 thousand downloads last month - 1 maintainer
tehran-stocks 2.0.0
Data Downloader for Tehran stock market
26 versions - Latest release: 10 months ago - 1 dependent repository - 405 downloads last month - 450 stars on GitHub - 1 maintainer
dao-scripts 1.2.2
A tool to download data to monitor DAO activity
25 versions - Latest release: 5 months ago - 1 dependent package - 450 downloads last month - 0 stars on GitHub - 2 maintainers
Top 9.0% on pypi.org
omrdatasettools 1.4.0
A collection of tools that simplify the downloading and handling of datasets used for Optical Mus...
25 versions - Latest release: 7 months ago - 11 dependent repositories - 90 downloads last month - 279 stars on GitHub - 1 maintainer
datatap 0.3.0
Client library for dataTap
25 versions - Latest release: almost 2 years ago - 1 dependent repository - 148 downloads last month - 34 stars on GitHub - 1 maintainer
Top 5.8% on pypi.org
datadotworld 2.0.0
Python library for data.world
25 versions - Latest release: 30 days ago - 26 dependent repositories - 42.5 thousand downloads last month - 100 stars on GitHub - 1 maintainer
Top 7.2% on pypi.org
xarray-dataclasses 1.7.0
xarray data creation made easy by dataclass
25 versions - Latest release: 7 months ago - 3 dependent packages - 8 dependent repositories - 14.1 thousand downloads last month - 62 stars on GitHub - 1 maintainer
ds-format 4.1.0
ds-format is an open source program, a Python package and a storage format which provides an inte...
24 versions - Latest release: 3 months ago - 4 dependent repositories - 220 downloads last month - 1 star on GitHub - 1 maintainer
Top 2.2% on pypi.org
colour-science 0.4.4 💰
Colour Science for Python
23 versions - Latest release: 5 months ago - 23 dependent packages - 94 dependent repositories - 99.2 thousand downloads last month - 1,969 stars on GitHub - 1 maintainer
canrevan 4.0.0
Simple Naver News Crawler
23 versions - Latest release: over 3 years ago - 1 dependent package - 1 dependent repository - 148 downloads last month - 86 stars on GitHub - 1 maintainer
wikipedia-ner 0.0.24
Python package for creating labeled examples from wiki dumps
22 versions - Latest release: about 9 years ago - 3 dependent repositories - 93 downloads last month - 68 stars on GitHub - 1 maintainer
starintel-doc 0.8.0
Document Spec for Star intel
22 versions - Latest release: over 1 year ago - 1 dependent repository - 53 downloads last month - 0 stars on GitLab.com - 1 maintainer
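
Each entry's statistics line in this listing follows a regular pattern: fields separated by " - ", each one a number (optionally followed by "thousand" or "million") and a label. The Python sketch below shows one way such a line could be split into fields. It assumes only the wording seen in the listing above; `parse_stats` and its key names are hypothetical and are not part of the Ecosyste.ms API.

```python
import re

# Multipliers for the abbreviated counts used in the listing
# (assumption: only "thousand" and "million" occur, as seen above).
_SCALE = {"thousand": 1_000, "million": 1_000_000}

# Label prefixes mapped to hypothetical dictionary keys; prefixes cover both
# singular and plural forms ("1 maintainer", "8 maintainers").
_KEYS = {
    "version": "versions",
    "dependent package": "dependent_packages",
    "dependent repositor": "dependent_repositories",
    "download": "downloads_last_month",
    "star": "stars",
    "maintainer": "maintainers",
}

# Matches "<number> [thousand|million] <label>".
_FIELD = re.compile(r"^([\d.,]+)\s+(?:(thousand|million)\s+)?(.+)$")


def parse_stats(line: str) -> dict:
    """Split one statistics line from the listing into a dict."""
    stats = {}
    for part in line.split(" - "):
        part = part.strip()
        if part.startswith("Latest release:"):
            # Keep the relative date as text; it cannot be made absolute
            # without knowing when the page was captured.
            stats["latest_release"] = part.split(":", 1)[1].strip()
            continue
        match = _FIELD.match(part)
        if match is None:
            continue  # ignore anything that is not "<number> <label>"
        number, scale, label = match.groups()
        value = round(float(number.replace(",", "")) * _SCALE.get(scale, 1))
        for prefix, key in _KEYS.items():
            if label.startswith(prefix):
                stats[key] = value
                break
    return stats


if __name__ == "__main__":
    example = (
        "1,859 versions - Latest release: 4 months ago - 13 dependent packages"
        " - 296 dependent repositories - 1.35 million downloads last month"
        " - 4,085 stars on GitHub - 8 maintainers"
    )
    print(parse_stats(example))
```

Fields that an entry omits (for example, entries with no dependent packages) are simply absent from the result, so callers may want `dict.get` with a default.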