Ecosyste.ms: Packages
An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.
pypi.org "datacleaning" keyword
medaprep 0.1.1
medaprep is a data preparation and feature engineering toolkit for geospatial applications.
2 versions - Latest release: almost 2 years ago - 28 downloads last month - 1 star on GitHub - 1 maintainer
toolstack 0.1.5
A collection of useful tools to speed up data processing, cleaning and pipelining.
3 versions - Latest release: over 3 years ago - 1 dependent repository - 34 downloads last month - 0 stars on GitHub - 1 maintainer
spark-lean 0.3.3
An interactive PySpark-based Data Cleaning Library
4 versions - Latest release: about 6 years ago - 1 dependent repository - 20 downloads last month - 7 stars on GitHub - 2 maintainers
romcrap 0.1
A library that will transform the life of a Data Scientist
1 version - Latest release: almost 8 years ago - 2 dependent repositories - 7 downloads last month - 0 stars on GitHub - 1 maintainer
manoelgadifa 0.2
A library that will transform the life of a Data Scientist
2 versions - Latest release: almost 8 years ago - 1 dependent repository - 12 downloads last month - 0 stars on GitHub - 1 maintainer
itallic 0.0.8
Detects potentially corrupt entries in a dataframe with lat, lng and country tagged data.
8 versions - Latest release: about 3 years ago - 1 dependent repository - 37 downloads last month - 0 stars on GitHub - 1 maintainer
hypergbm 0.3.2
Top 8.6% on pypi.org
A full-pipeline AutoML tool integrating various GBM models
19 versions - Latest release: 3 months ago - 1 dependent repository - 2.43 thousand downloads last month - 319 stars on GitHub - 1 maintainer
gotext 0.9.5
GoText is a universal text extraction and preprocessing tool for Python which supports wide vari...
2 versions - Latest release: over 2 years ago - 1 dependent repository - 9 downloads last month - 0 stars on GitHub - 1 maintainer
dirtyclean 0.1
Get rid of Unicode punctuation and other garbage from strings
1 version - Latest release: almost 7 years ago - 1 dependent repository - 10 downloads last month - 3 stars on GitHub - 1 maintainer
dataprep 0.4.5
Top 2.8% on pypi.org
Dataprep: Data Preparation in Python
33 versions - Latest release: almost 2 years ago - 1 dependent package - 55 dependent repositories - 43.6 thousand downloads last month - 1,902 stars on GitHub - 4 maintainers
covid-alberta 0.0.4
This is a small package to look at some of the Alberta-specific COVID data.
3 versions - Latest release: about 4 years ago - 1 dependent repository - 27 downloads last month - 1 star on GitHub - 1 maintainer
cooka 0.1.5
A lightweight AutoML system.
4 versions - Latest release: over 2 years ago - 1 dependent repository - 84 downloads last month - 319 stars on GitHub - 1 maintainer
great-expectations-experimental 0.1.20240502061
Top 6.3% on pypi.org
Always know what to expect from your data.
501 versions - Latest release: 11 days ago - 1 dependent repository - 247 thousand downloads last month - 9,129 stars on GitHub - 4 maintainers
cleantext 1.1.4
Top 5.3% on pypi.org
An open-source Python package to clean raw text data
8 versions - Latest release: over 2 years ago - 3 dependent packages - 21 dependent repositories - 27.4 thousand downloads last month - 65 stars on GitHub - 1 maintainer
ricco 1.4.1
A handy ETL&GEOM kit
54 versions - Latest release: 2 months ago - 1 dependent repository - 371 downloads last month - 1 star on GitHub - 1 maintainer
great-expectations 0.18.13
Top 0.7% on pypi.org
Always know what to expect from your data.
264 versions - Latest release: 14 days ago - 42 dependent packages - 284 dependent repositories - 19.3 million downloads last month - 9,420 stars on GitHub - 8 maintainers
limpieza 0.1
A library that will clean a dataframe
1 version - Latest release: almost 8 years ago - 2 dependent repositories - 8 downloads last month - 0 stars on GitHub - 1 maintainer
great-expectations-cta 0.15.43
Always know what to expect from your data.
2 versions - Latest release: over 1 year ago - 1 dependent package - 35 downloads last month - 9,124 stars on GitHub - 1 maintainer
Related Keywords
data-science: 6
eda: 5
exploratory-data-analysis: 5
data: 4
python: 4
methodselection: 3
pipeline-debt: 3
mlops: 3
exploratorydataanalysis: 3
exploratory-analysis: 3
data-engineering: 3
data-profilers: 3
data-profiling: 3
data-quality: 3
data-unit-tests: 3
datacleaner: 3
dataunittest: 3
science: 3
testing: 3
nlp: 3
pipeline: 3
quality: 3
dataquality: 3
validation: 3
datavalidation: 3
cleandata: 3
pipeline-tests: 3
pipeline-testing: 3
datascience: 3
xgboost: 2
tabular-data: 2
sklearn: 2
semi-supervised-learning: 2
rapidsai: 2
data science: 2
adversarial-validation: 2
automl: 2
catboost: 2
dask: 2
dask-distributed: 2
distributed-training: 2
ensemble-learning: 2
fullpipeline: 2
gbm: 2
gpu-acceleration: 2
lightgbm: 2
preprocessing: 2
pseudo-labeling: 2
webscraper: 1
alberta: 1
canada: 1
covid-19: 1
covid19-data: 1
gis: 1
geopandas: 1
geometry: 1
etl: 1
dataprocessing: 1
cleantext: 1
cleaning-data: 1
covid19: 1
document extraction: 1
text preprocessing: 1
text extraction: 1
plant-breeding-data: 1
data-cleaning-pipeline: 1
conda: 1
itallic: 1
pyspark: 1
PySpark: 1
machine-learning: 1
data-analysis: 1
machine learning: 1
toolstack: 1
data cleaning: 1
toolkit: 1
xarray: 1
webconnector: 1
datapreparation: 1
dataconnector: 1
data-exploration: 1
cleaning: 1
apiwrapper: 1
apis: 1
data exploration: 1
exploratory data analysis: 1
connector: 1
dataprep: 1
textmining: 1
text-preprocessing: 1
similarity-score: 1
data-preprocessing: 1
text utils: 1