Ecosyste.ms: Packages

An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "data-profiling" keyword

Top 0.5% on pypi.org
pandas-profiling 3.6.6
Deprecated 'pandas-profiling' package, use 'ydata-profiling' instead
40 versions - Latest release: over 1 year ago - 46 dependent packages - 1,970 dependent repositories - 738 thousand downloads last month - 12,090 stars on GitHub - 4 maintainers
Top 0.7% on pypi.org
great-expectations 0.18.13
Always know what to expect from your data.
264 versions - Latest release: 17 days ago - 58 dependent packages - 284 dependent repositories - 19.7 million downloads last month - 9,420 stars on GitHub - 8 maintainers
Top 2.1% on pypi.org
sweetviz 2.3.1
A pandas-based library to visualize and compare datasets.
35 versions - Latest release: 6 months ago - 16 dependent packages - 167 dependent repositories - 64.5 thousand downloads last month - 2,824 stars on GitHub - 1 maintainer
Top 2.5% on pypi.org
pandas-summary 0.2.0
An extension to pandas describe function.
7 versions - Latest release: over 2 years ago - 4 dependent packages - 137 dependent repositories - 94.4 thousand downloads last month - 492 stars on GitHub - 2 maintainers
Top 1.0% on pypi.org
ydata-profiling 4.8.3
Generate profile report for pandas DataFrame
19 versions - Latest release: 9 days ago - 43 dependent packages - 79 dependent repositories - 1.37 million downloads last month - 11,645 stars on GitHub - 1 maintainer
Top 1.9% on pypi.org
pydeequ 1.3.0
PyDeequ - Unit Tests for Data
12 versions - Latest release: 20 days ago - 8 dependent packages - 53 dependent repositories - 11.8 million downloads last month - 649 stars on GitHub - 2 maintainers
Top 2.1% on pypi.org
cleanlab 2.6.4
The standard package for data-centric AI, machine learning with label errors, and automatically f...
29 versions - Latest release: 9 days ago - 11 dependent packages - 19 dependent repositories - 26.2 thousand downloads last month - 8,808 stars on GitHub - 4 maintainers
Top 3.5% on pypi.org
datatile 1.0.3
A library for managing, summarizing, and visualizing data.
20 versions - Latest release: almost 2 years ago - 1 dependent package - 14 dependent repositories - 111 thousand downloads last month - 490 stars on GitHub - 2 maintainers
Top 3.3% on pypi.org
traceml 1.14.2
Engine for ML/Data tracking, visualization, dashboards, and model UI for Polyaxon.
86 versions - Latest release: over 2 years ago - 3 dependent packages - 8 dependent repositories - 108 thousand downloads last month - 492 stars on GitHub - 2 maintainers
Top 4.8% on pypi.org
optimuspyspark 2.2.32
Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion wi...
83 versions - Latest release: almost 4 years ago - 8 dependent repositories - 10.7 thousand downloads last month - 1,441 stars on GitHub - 2 maintainers
Top 4.6% on pypi.org
popmon 1.4.6
Monitor the stability of a pandas or spark dataset
36 versions - Latest release: 10 months ago - 1 dependent package - 6 dependent repositories - 11.6 thousand downloads last month - 483 stars on GitHub - 3 maintainers
Top 6.0% on pypi.org
piperider 1.0.2
PipeRider CLI
170 versions - Latest release: about 3 years ago - 5 dependent repositories - 4.47 thousand downloads last month - 466 stars on GitHub - 1 maintainer
gate-drift 0.1.5
Data drift detection tool for machine learning pipelines.
5 versions - Latest release: about 1 year ago - 2 dependent repositories - 52 downloads last month - 19 stars on GitHub - 1 maintainer
Top 6.0% on pypi.org
cleanvision 0.3.6
Find issues in image datasets
12 versions - Latest release: 3 months ago - 3 dependent packages - 2 dependent repositories - 8.41 thousand downloads last month - 917 stars on GitHub - 6 maintainers
Top 3.9% on pypi.org
openmetadata-ingestion 0.10.1
Ingestion Framework for OpenMetadata
274 versions - Latest release: almost 2 years ago - 3 dependent packages - 2 dependent repositories - 37.1 thousand downloads last month - 4,168 stars on GitHub - 1 maintainer
haisweetviz 1.0.2
A pandas-based library to visualize and compare datasets.
2 versions - Latest release: almost 4 years ago - 1 dependent repository - 15 downloads last month - 2,824 stars on GitHub - 1 maintainer
Top 9.4% on pypi.org
pyoptimus 0.1.0
Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion.
32 versions - Latest release: over 1 year ago - 1 dependent repository - 349 downloads last month - 1,441 stars on GitHub - 2 maintainers
Top 6.3% on pypi.org
great-expectations-experimental 0.1.20240513031
Always know what to expect from your data.
502 versions - Latest release: 3 days ago - 1 dependent package - 1 dependent repository - 255 thousand downloads last month - 9,129 stars on GitHub - 4 maintainers
easiersdk 0.1.16
This library contains code for interacting with EASIER.AI platform.
102 versions - Latest release: almost 3 years ago - 1 dependent repository - 563 downloads last month - 2,824 stars on GitHub - 1 maintainer
Top 9.7% on pypi.org
openmetadata-ingestion-core 0.10.0
These are the generated Python classes from JSON Schema
12 versions - Latest release: about 2 years ago - 1 dependent package - 1 dependent repository - 204 downloads last month - 3,365 stars on GitHub - 1 maintainer
Top 8.8% on pypi.org
openmetadata-airflow-managed-apis 0.10.1
Airflow REST APIs to create and manage DAGS
31 versions - Latest release: almost 2 years ago - 1 dependent repository - 730 downloads last month - 3,365 stars on GitHub - 1 maintainer
openmetadata-sqlalchemy-bigquery 1.2.0
SQLAlchemy dialect for BigQuery by OpenMetadata
4 versions - Latest release: over 2 years ago - 1 dependent package - 1 dependent repository - 40 downloads last month - 4,168 stars on GitHub - 1 maintainer
haiqv-profiling 0.0.1
Generate profile report for pandas DataFrame
1 version - Latest release: over 3 years ago - 1 dependent repository - 25 downloads last month - 12,071 stars on GitHub - 1 maintainer
metacrafter 0.0.2
Metacrafter metadata classification tool
2 versions - Latest release: almost 2 years ago - 1 dependent repository - 17 downloads last month - 38 stars on GitHub - 1 maintainer
raymon 0.0.39
Python package for data logging and monitoring.
14 versions - Latest release: over 2 years ago - 1 dependent repository - 9 downloads last month - 18 stars on GitHub - 1 maintainer
cleanlab-studio 2.0.4
Client interface for all things Cleanlab Studio
79 versions - Latest release: 10 days ago - 1 dependent repository - 3.21 thousand downloads last month - 21 stars on GitHub - 4 maintainers
piperider-cli 0.1.3.12
PipeRider CLI
9 versions - Latest release: almost 2 years ago - 1 dependent repository - 41 downloads last month - 467 stars on GitHub - 1 maintainer
panda-helper 0.0.2
Data profiler for Pandas
2 versions - Latest release: almost 2 years ago - 1 dependent repository - 76 downloads last month - 2 stars on GitHub - 1 maintainer
piperider-nightly 0.42.0.20240515
PipeRider CLI
540 versions - Latest release: about 19 hours ago - 5.93 thousand downloads last month - 450 stars on GitHub - 1 maintainer
dqops 1.3.0
DQOps Data Quality Operations Center
17 versions - Latest release: 7 days ago - 402 downloads last month - 52 stars on GitHub - 1 maintainer
odd-collector 0.1.18
ODD Collector
1 version - Latest release: over 1 year ago - 13 downloads last month - 40 stars on GitHub - 1 maintainer
compars 0.0.0
DataFrame comparison done right (AKA the Bear-agnostic DataFrame comparison library)
1 version - Latest release: 26 days ago - 156 downloads last month - 0 stars on GitHub - 1 maintainer
great-expectations-cta 0.15.43
Always know what to expect from your data.
2 versions - Latest release: over 1 year ago - 1 dependent package - 37 downloads last month - 9,124 stars on GitHub - 1 maintainer
pydeequalb 0.0.4
PyDeequ - Unit Tests for Data
3 versions - Latest release: over 1 year ago - 24 downloads last month - 649 stars on GitHub - 1 maintainer
Top 4.7% on pypi.org
pydeequ-alb 0.0.1 (removed)
PyDeequ - Unit Tests for Data
1 version - Latest release: over 1 year ago - 421 stars on GitHub
Top 9.4% on pypi.org
openmetadata-managed-apis 1.4.0.0rc2
Airflow REST APIs to create and manage DAGS
123 versions - Latest release: 6 days ago - 5.53 thousand downloads last month - 4,168 stars on GitHub - 1 maintainer
lineagemd 0.0.0
Lineage metadata for ML/AI/Data.
1 version - Latest release: over 1 year ago - 10 downloads last month - 452 stars on GitHub - 1 maintainer
hauptai 0.0.0
Haupt ai.
1 version - Latest release: over 1 year ago - 14 downloads last month - 452 stars on GitHub - 1 maintainer
example-package-elisno 2.6.24
The standard package for data-centric AI, machine learning with label errors, and automatically f...
7 versions - Latest release: 2 months ago - 65 downloads last month - 8,808 stars on GitHub - 1 maintainer
idg-metadata-client 1.0.2.0
Ingestion Framework for OpenMetadata
1 version - Latest release: 10 months ago - 33 downloads last month - 3,365 stars on GitHub - 1 maintainer
cleanlab-cli 0.1.14
Command line interface for all things Cleanlab Studio
16 versions - Latest release: over 1 year ago - 129 downloads last month - 20 stars on GitHub - 3 maintainers
desbordante 2.0.0
Science-intensive high-performance data profiler
3 versions - Latest release: 29 days ago - 1 dependent package - 138 downloads last month - 61 stars on GitHub - 1 maintainer
Top 9.8% on pypi.org
haupt 2.1.8
Lineage metadata API, artifacts streams, sandbox, ML-API, and spaces for Polyaxon.
125 versions - Latest release: 26 days ago - 1 dependent package - 947 downloads last month - 452 stars on GitHub - 1 maintainer
zarque-profiling 0.5.10
Data profiling tools for Big Data
6 versions - Latest release: 10 months ago - 419 downloads last month - 2 stars on GitHub - 1 maintainer
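Each entry's statistics line in the listing above uses the same " - "-separated layout, so it can be turned into structured fields. The following is a minimal Python sketch, assuming only the plain-text layout shown on this page. The function name `parse_stats` and the dictionary keys are hypothetical and are not part of the Ecosyste.ms API.

```python
import re

# Multipliers for the spelled-out magnitudes seen in the listing.
_SCALE = {"thousand": 1_000, "million": 1_000_000}

# A number, an optional magnitude word, then the label text.
_FIELD = re.compile(r"^([\d.,]+)\s+(?:(thousand|million)\s+)?(.+)$")

# Label prefixes mapped to hypothetical key names, so that singular and
# plural forms ("1 maintainer", "4 maintainers") share one key.
_LABELS = (
    ("version", "versions"),
    ("dependent package", "dependent_packages"),
    ("dependent repositor", "dependent_repositories"),
    ("downloads", "downloads_last_month"),
    ("stars", "github_stars"),
    ("maintainer", "maintainers"),
)


def parse_stats(line: str) -> dict:
    """Parse one statistics line from the listing into a dict."""
    stats = {}
    for part in line.split(" - "):
        part = part.strip()
        if part.startswith("Latest release:"):
            # The relative age is kept as text, e.g. "over 1 year ago".
            stats["latest_release"] = part.split(":", 1)[1].strip()
            continue
        match = _FIELD.match(part)
        if match is None:
            continue  # ignore anything that is not "<number> <label>"
        number, scale, label = match.groups()
        value = float(number.replace(",", "")) * _SCALE.get(scale, 1)
        for prefix, key in _LABELS:
            if label.startswith(prefix):
                stats[key] = round(value)
                break
    return stats
```

Applied to the pandas-profiling line, this yields `downloads_last_month` of 738000 and `dependent_repositories` of 1970. Fields missing from a line (for example, no dependent packages) are simply absent from the result.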
Related Keywords
data-science (35), data-quality (28), python (17), machine-learning (17), data-exploration (16), data-analysis (14), dataquality (14), exploratory-data-analysis (14), pandas (13), eda (13), data-validation (12), data-visualization (12), mlops (11), data-observability (11), statistics (10), data-quality-checks (10), dbt (9), deep-learning (9), jupyter (8), data-engineering (8), data-catalog (7), data-discovery (7), data-governance (7), data-cleaning (7), datacatalog (7), datadiscovery (7), spark (7), metadata (7), matplotlib (6), dataengineering (6), snowflake (6), plotly (6), pytorch (6), data-lineage (6), tensorflow (6), tracking (6), metadata-management (6), data-contracts (6), pandas-dataframe (6), dataunittest (6), exploration (6), data-unit-tests (6), data-profilers (6), hacktoberfest (5), ipython (5), polyaxon (5), dataops (5), visualization (5), data-centric-ai (5), dask (5), datacleaner (5), kubernetes (5), data-curation (4), data-labeling (4), bigquery (4), lineage (4), artificial-intelligence (4), noisy-labels (4), data-profiler (4), outlier-detection (4), dataframes (4), ai (3), reinforcement-learning (3), neural-networks (3), gcs (3), pull-requests (3), google cloud storage (3), azure (3), microsoft (3), google cloud (3), tensorFlow (3), analytics (3), pyspark (3), data-wrangling (3), data-cleansing (3), dbt-metrics (3), data-testing (3), data-reliability (3), data-pipeline (3), continuous-integration (3), code-review (3), reporting (3), computer-vision (3), image-classification (3), data-collaboration (3), pandas-summary (3), explainable-ai (3), serving (3), ui (3), pipeline-tests (3), pipeline-testing (3), pipeline-debt (3), exploratorydataanalysis (3), exploratory-analysis (3), datacleaning (3), cleandata (3), datavalidation (3), validation (3), quality (3), pipeline (3)