Ecosyste.ms: Packages

An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org "data-quality" keyword

openmetadata-sqlalchemy-bigquery 1.2.0
SQLAlchemy dialect for BigQuery by OpenMetadata
4 versions - Latest release: over 2 years ago - 1 dependent package - 1 dependent repository - 37 downloads last month - 4,149 stars on GitHub - 1 maintainer
Top 9.4% on pypi.org
openmetadata-managed-apis 1.3.3.0
Airflow REST APIs to create and manage DAGS
120 versions - Latest release: 10 days ago - 3.94 thousand downloads last month - 4,149 stars on GitHub - 1 maintainer
Top 3.9% on pypi.org
openmetadata-ingestion 0.10.1
Ingestion Framework for OpenMetadata
271 versions - Latest release: almost 2 years ago - 2 dependent packages - 2 dependent repositories - 27.5 thousand downloads last month - 4,149 stars on GitHub - 1 maintainer
Top 8.8% on pypi.org
openmetadata-airflow-managed-apis 0.10.1
Airflow REST APIs to create and manage DAGS
31 versions - Latest release: almost 2 years ago - 1 dependent repository - 582 downloads last month - 3,365 stars on GitHub - 1 maintainer
idg-metadata-client 1.0.2.0
Ingestion Framework for OpenMetadata
1 version - Latest release: 10 months ago - 29 downloads last month - 3,365 stars on GitHub - 1 maintainer
Top 9.7% on pypi.org
openmetadata-ingestion-core 0.10.0
These are the generated Python classes from JSON Schema
12 versions - Latest release: about 2 years ago - 1 dependent package - 1 dependent repository - 184 downloads last month - 3,365 stars on GitHub - 1 maintainer
piperider-nightly 0.42.0.20240428
PiperRider CLI
527 versions - Latest release: about 7 hours ago - 4.17 thousand downloads last month - 450 stars on GitHub - 1 maintainer
hooqu 0.1.0
Data unit testing for your Python DataFrames
1 version - Latest release: over 3 years ago - 1 dependent repository - 12 downloads last month - 25 stars on GitHub - 2 maintainers
dqops 1.2.0
DQOps Data Quality Operations Center
16 versions - Latest release: 4 days ago - 189 downloads last month - 52 stars on GitHub - 2 maintainers
Top 2.1% on pypi.org
cleanlab 2.6.3
The standard package for data-centric AI, machine learning with label errors, and automatically f...
28 versions - Latest release: about 1 month ago - 8 dependent packages - 19 dependent repositories - 16.1 thousand downloads last month - 8,645 stars on GitHub - 5 maintainers
example-package-elisno 2.6.24
The standard package for data-centric AI, machine learning with label errors, and automatically f...
7 versions - Latest release: about 2 months ago - 52 downloads last month - 8,645 stars on GitHub - 1 maintainer
Top 1.0% on pypi.org
ydata-profiling 4.7.0
Generate profile report for pandas DataFrame
18 versions - Latest release: about 1 month ago - 20 dependent packages - 79 dependent repositories - 1.21 million downloads last month - 11,645 stars on GitHub - 1 maintainer
cleanlab-studio 2.0.2
Client interface for all things Cleanlab Studio
77 versions - Latest release: 9 days ago - 1 dependent repository - 3.11 thousand downloads last month - 20 stars on GitHub - 5 maintainers
Top 1.9% on pypi.org
pydeequ 1.3.0
PyDeequ - Unit Tests for Data
12 versions - Latest release: 2 days ago - 6 dependent packages - 53 dependent repositories - 11.6 million downloads last month - 645 stars on GitHub - 4 maintainers
Top 3.5% on pypi.org
datatile 1.0.3
A library for managing, summarizing, and visualizing data.
20 versions - Latest release: over 1 year ago - 1 dependent package - 14 dependent repositories - 111 thousand downloads last month - 490 stars on GitHub - 2 maintainers
Top 3.3% on pypi.org
traceml 1.14.2
Engine for ML/Data tracking, visualization, dashboards, and model UI for Polyaxon.
86 versions - Latest release: over 2 years ago - 3 dependent packages - 8 dependent repositories - 121 thousand downloads last month - 492 stars on GitHub - 2 maintainers
Top 1.2% on pypi.org
whylogs 1.3.30
Profile and monitor your ML data pipeline end-to-end
306 versions - Latest release: 11 days ago - 5 dependent packages - 413 dependent repositories - 326 thousand downloads last month - 2,482 stars on GitHub - 5 maintainers
Top 2.2% on pypi.org
fiftyone-db 1.1.2
FiftyOne DB
21 versions - Latest release: about 2 months ago - 1 dependent package - 36 dependent repositories - 62.8 thousand downloads last month - 6,627 stars on GitHub - 4 maintainers
cz-data-diff 0.0.4
Command-line tool and Python library to efficiently diff rows across two different databases.
3 versions - Latest release: 4 months ago - 26 downloads last month - 2,657 stars on GitHub - 4 maintainers
pydeequalb 0.0.4
PyDeequ - Unit Tests for Data
3 versions - Latest release: over 1 year ago - 26 downloads last month - 643 stars on GitHub - 2 maintainers
Top 4.7% on pypi.org
pydeequ-alb 0.0.1 removed
PyDeequ - Unit Tests for Data
1 version - Latest release: over 1 year ago - 421 stars on GitHub
diqu-email 0.0.2
Data Quality CLI for the Auto-Alerts - Emails
2 versions - Latest release: 5 months ago - 18 downloads last month - 1 maintainer
Top 0.7% on pypi.org
great-expectations 0.18.12
Always know what to expect from your data.
262 versions - Latest release: about 1 month ago - 42 dependent packages - 284 dependent repositories - 18.6 million downloads last month - 9,420 stars on GitHub - 8 maintainers
Top 2.5% on pypi.org
pandas-summary 0.2.0
An extension to pandas describe function.
7 versions - Latest release: over 2 years ago - 4 dependent packages - 137 dependent repositories - 93.4 thousand downloads last month - 490 stars on GitHub - 2 maintainers
Top 0.5% on pypi.org
pandas-profiling 3.6.6
Deprecated 'pandas-profiling' package, use 'ydata-profiling' instead
40 versions - Latest release: about 1 year ago - 46 dependent packages - 1,970 dependent repositories - 653 thousand downloads last month - 12,039 stars on GitHub - 4 maintainers
cuallee 0.10.0
Python library for data validation on DataFrame APIs including Snowflake/Snowpark, Apache/PySpark...
73 versions - Latest release: about 1 month ago - 1 dependent package - 1 dependent repository - 11.9 thousand downloads last month - 107 stars on GitHub - 2 maintainers
Top 7.0% on pypi.org
tangled-up-in-unicode 0.2.0
Access to the Unicode Character Database (UCD)
9 versions - Latest release: over 2 years ago - 11 dependent packages - 722 dependent repositories - 861 thousand downloads last month - 3 stars on GitHub - 2 maintainers
Top 1.0% on pypi.org
feast 0.37.1
Python SDK for Feast
118 versions - Latest release: 11 days ago - 13 dependent packages - 140 dependent repositories - 147 thousand downloads last month - 5,027 stars on GitHub - 5 maintainers
data-expectations 1.7.0
Are your data meeting all your expectations
10 versions - Latest release: 7 months ago - 1 dependent package - 1 dependent repository - 13.5 thousand downloads last month - 1 star on GitHub - 2 maintainers
cleanlab-cli 0.1.14
Command line interface for all things Cleanlab Studio
16 versions - Latest release: over 1 year ago - 129 downloads last month - 20 stars on GitHub - 6 maintainers
dataexpectations 0.0.6
Is your data meeting all your expectations
1 version - Latest release: almost 3 years ago - 1 dependent repository - 10 downloads last month - 1 star on GitHub - 2 maintainers
kestra 0.12.0
Kestra is an infinitely scalable orchestration and scheduling platform, creating, running, schedu...
3 versions - Latest release: 6 months ago - 1 dependent repository - 109 thousand downloads last month - 6,340 stars on GitHub - 2 maintainers
seedspark 0.4.3
SeedSpark is an Extensible PySpark utility package to create production spark pipelines and dev-t...
14 versions - Latest release: 9 months ago - 1 dependent repository - 89 downloads last month - 2 stars on GitHub - 2 maintainers
redflag 0.5.0
Safety net for machine learning pipelines.
30 versions - Latest release: 6 days ago - 1 dependent repository - 297 downloads last month - 19 stars on GitHub - 2 maintainers
Top 6.3% on pypi.org
great-expectations-experimental 0.1.20240422039
Always know what to expect from your data.
497 versions - Latest release: 6 days ago - 1 dependent repository - 240 thousand downloads last month - 9,129 stars on GitHub - 7 maintainers
Top 5.8% on pypi.org
re-data 0.11.0
re_data - data quality framework
49 versions - Latest release: 4 months ago - 2 dependent repositories - 64 thousand downloads last month - 1,495 stars on GitHub - 1 maintainer
compars 0.0.0
DataFrame comparison done right (AKA the Bear-agnostic DataFrame comparison library)
1 version - Latest release: 8 days ago - 134 downloads last month - 0 stars on GitHub - 2 maintainers
thetis 0.2.0
Service to examine data processing pipelines (e.g., machine learning or deep learning pipelines) ...
5 versions - Latest release: about 1 month ago - 44 downloads last month - 1 star on GitHub - 2 maintainers
pydvl 0.9.1
The Python Data Valuation Library
13 versions - Latest release: 7 days ago - 288 downloads last month - 65 stars on GitHub - 3 maintainers
thetiscore 0.2.0
Service to examine data processing pipelines (e.g., machine learning or deep learning pipelines) ...
6 versions - Latest release: about 1 month ago - 1 dependent package - 92 downloads last month - 1 star on GitHub - 1 maintainer
mapology 2023.10.21
Simple python filters generating leaflet driven apps.
9 versions - Latest release: 6 months ago - 1 dependent repository - 992 downloads last month - 1 maintainer
Top 6.0% on pypi.org
piperider 1.0.2
PiperRider CLI
170 versions - Latest release: almost 3 years ago - 5 dependent repositories - 4.43 thousand downloads last month - 466 stars on GitHub - 1 maintainer
leila 0.2
Library for measuring data quality in structured datasets
2 versions - Latest release: over 2 years ago - 2 dependent repositories - 287 downloads last month - 59 stars on GitHub - 2 maintainers
piperider-cli 0.1.3.12
PiperRider CLI
9 versions - Latest release: almost 2 years ago - 1 dependent repository - 41 downloads last month - 467 stars on GitHub - 2 maintainers
glassesvalidator 1.2.4
Automatic determination of accuracy of wearable eye tracker recordings.
19 versions - Latest release: about 2 months ago - 84 downloads last month - 7 stars on GitHub - 1 maintainer
Top 4.5% on pypi.org
data-diff 0.11.1
Command-line tool and Python library to efficiently diff rows across two different databases.
74 versions - Latest release: 2 months ago - 1 dependent package - 2 dependent repositories - 56.8 thousand downloads last month - 2,657 stars on GitHub - 10 maintainers
Top 7.5% on pypi.org
feathr 1.0.0
An Enterprise-Grade, High Performance Feature Store
22 versions - Latest release: about 1 year ago - 1 dependent repository - 1.8 thousand downloads last month - 1,928 stars on GitHub - 1 maintainer
data_check 0.19.0
simple data validation
22 versions - Latest release: about 1 month ago - 322 downloads last month - 4 stars on GitHub - 1 maintainer
Top 10.0% on pypi.org
encord-active 0.1.83
Enable users to improve machine learning models in an active learning fashion via data, label, an...
76 versions - Latest release: 3 months ago - 1 dependent repository - 512 downloads last month - 420 stars on GitHub - 1 maintainer
featureform-enterprise 0.12.1
Package for the Featureform Enterprise Feature Store
49 versions - Latest release: about 1 month ago - 3.24 thousand downloads last month - 1,672 stars on GitHub - 2 maintainers
badgers 0.0.5
Badgers: bad data generators
5 versions - Latest release: about 1 month ago - 94 downloads last month - 8 stars on GitHub - 2 maintainers
feathub-nightly 0.2.dev20231231
A stream-batch unified feature store for real-time machine learning
388 versions - Latest release: 4 months ago - 69 downloads last month - 293 stars on GitHub - 1 maintainer
feathub 0.1.0
A stream-batch unified feature store for real-time machine learning
2 versions - Latest release: 7 months ago - 3 downloads last month - 293 stars on GitHub - 1 maintainer
feastmo 0.341.0
Python SDK for Feast
8 versions - Latest release: 6 months ago - 152 downloads last month - 4,988 stars on GitHub - 2 maintainers
lakefs 0.6.0
lakeFS Python SDK Wrapper
15 versions - Latest release: 12 days ago - 12.3 thousand downloads last month - 4,053 stars on GitHub - 1 maintainer
Top 6.3% on pypi.org
lakefs-sdk 1.20.0
lakeFS API
34 versions - Latest release: 12 days ago - 3 dependent packages - 1 dependent repository - 17.6 thousand downloads last month - 4,054 stars on GitHub - 2 maintainers
Top 3.2% on pypi.org
lakefs-client 1.20.0
lakeFS API
150 versions - Latest release: 12 days ago - 4 dependent packages - 5 dependent repositories - 15.4 thousand downloads last month - 4,053 stars on GitHub - 1 maintainer
Top 7.2% on pypi.org
fiftyone-desktop 0.33.7
FiftyOne Desktop
60 versions - Latest release: 13 days ago - 1 dependent package - 1 dependent repository - 1.33 thousand downloads last month - 6,627 stars on GitHub - 6 maintainers
airflow-provider-great-expectations-cta 0.2.4
An Apache Airflow provider for Great Expectations
4 versions - Latest release: over 1 year ago - 24 downloads last month - 151 stars on GitHub - 1 maintainer
Top 3.9% on pypi.org
airflow-provider-great-expectations 0.2.8
An Apache Airflow provider for Great Expectations
23 versions - Latest release: 2 months ago - 1 dependent package - 31 dependent repositories - 127 thousand downloads last month - 151 stars on GitHub - 9 maintainers
contessa 0.2.12
Data-quality framework
14 versions - Latest release: almost 3 years ago - 1 dependent repository - 23 downloads last month - 18 stars on GitHub - 4 maintainers
dac 0.4.2
Tool to distribute data as code
13 versions - Latest release: 3 months ago - 1 dependent repository - 394 downloads last month - 6 stars on GitHub - 2 maintainers
great-expectations-cta 0.15.43
Always know what to expect from your data.
2 versions - Latest release: over 1 year ago - 1 dependent package - 35 downloads last month - 9,124 stars on GitHub - 1 maintainer
fiftyone-db-ubuntu2204 0.4.0
FiftyOne DB
1 version - Latest release: 12 months ago - 6.7 thousand downloads last month - 6,627 stars on GitHub - 2 maintainers
fiftyone-db-ubuntu2004 0.4.0
FiftyOne DB
1 version - Latest release: over 1 year ago - 1 dependent repository - 15 downloads last month - 6,627 stars on GitHub - 1 maintainer
fiftyone-db-debian9 0.4.0
FiftyOne DB
6 versions - Latest release: over 1 year ago - 1 dependent repository - 10 downloads last month - 6,627 stars on GitHub - 2 maintainers
fiftyone-db-ubuntu1604 0.3.0
FiftyOne DB
5 versions - Latest release: over 3 years ago - 1 dependent repository - 15 downloads last month - 6,619 stars on GitHub - 2 maintainers
fiftyone-eval-only 0.14.3
FiftyOne, for evaluation only.
1 version - Latest release: over 2 years ago - 1 dependent repository - 6 downloads last month - 6,625 stars on GitHub - 2 maintainers
fiftyone-db-rhel7 0.4.0
FiftyOne DB
3 versions - Latest release: over 1 year ago - 1 dependent repository - 17 downloads last month - 6,625 stars on GitHub - 1 maintainer
Top 6.0% on pypi.org
cleanvision 0.3.6
Find issues in image datasets
12 versions - Latest release: 3 months ago - 2 dependent packages - 2 dependent repositories - 10.8 thousand downloads last month - 917 stars on GitHub - 10 maintainers
panda-patrol 0.0.102
🐼 Patrol your data tests
40 versions - Latest release: 5 months ago - 1 dependent repository - 90 downloads last month - 21 stars on GitHub - 2 maintainers
haiqv-profiling 0.0.1
Generate profile report for pandas DataFrame
1 version - Latest release: over 3 years ago - 1 dependent repository - 3 downloads last month - 11,966 stars on GitHub - 2 maintainers
diqu 0.1.2
Data Quality CLI for the Auto-Alerts
11 versions - Latest release: about 2 months ago - 1 dependent package - 59 downloads last month - 12 stars on GitHub - 2 maintainers
qprofiler 0.3.0
profile tabular datasets, manage automatic validation for new datasets, automatic handling for qu...
6 versions - Latest release: 7 months ago - 27 downloads last month - 0 stars on GitHub - 2 maintainers
reflekt 0.6.0
A CLI tool to define event schemas, lint them to enforce conventions, publish to a schema registr...
78 versions - Latest release: about 2 months ago - 1 dependent repository - 248 downloads last month - 71 stars on GitHub - 2 maintainers
data-diff-customize 1.0.3
Command-line tool and Python library to efficiently diff rows across two different databases.
5 versions - Latest release: 6 months ago - 52 downloads last month - 2,657 stars on GitHub - 2 maintainers
completely 0.1.0
A simple tool to measure data completeness
1 version - Latest release: over 3 years ago - 3 dependent repositories - 547 downloads last month - 0 stars on GitHub - 2 maintainers
sqlbucket 0.4.4
SQLBucket - Write your SQL ETL flow and ETL integrity tool.
65 versions - Latest release: over 1 year ago - 1 dependent repository - 47 downloads last month - 71 stars on GitHub - 6 maintainers
lakehouse-engine 1.19.0
A Spark framework serving as the engine for several lakehouse algorithms and data flows.
8 versions - Latest release: about 2 months ago - 746 downloads last month - 6 stars on GitHub - 2 maintainers
featureform 1.12.5
Package for the Featureform Feature Store
145 versions - Latest release: about 1 month ago - 1 dependent repository - 776 downloads last month - 1,669 stars on GitHub - 2 maintainers
marshmallow-pyspark 0.2.4
PySpark data serializer
6 versions - Latest release: 4 months ago - 1 dependent repository - 2.09 thousand downloads last month - 12 stars on GitHub - 2 maintainers
practicalai 0.0.1
practicalAI · A practical approach to machine learning
5 versions - Latest release: almost 5 years ago - 1 dependent repository - 19 downloads last month - 35,415 stars on GitHub - 2 maintainers
Top 9.9% on pypi.org
soda-spark 0.3.3
Soda SQL API for PySpark data frame
11 versions - Latest release: almost 2 years ago - 1 dependent package - 1 dependent repository - 50.1 thousand downloads last month - 60 stars on GitHub - 2 maintainers
embeddinghub 0.0.1
Data infrastructure for machine learning embeddings.
10 versions - Latest release: over 2 years ago - 1 dependent repository - 41 downloads last month - 1,665 stars on GitHub - 1 maintainer
driftdb 0.1.4
Historical metric store
64 versions - Latest release: 3 months ago - 708 downloads last month - 278 stars on GitHub - 2 maintainers
datagit 0.18.6
Git based metric store
38 versions - Latest release: 7 months ago - 34 downloads last month - 278 stars on GitHub - 2 maintainers
dbt-snapshot-analysis 0.2.13
A package for analyzing snapshots
16 versions - Latest release: 9 months ago - 16 downloads last month - 278 stars on GitHub - 1 maintainer
datadetective 0.4.0
The Data Detective, for better Machine Learning & AI
2 versions - Latest release: over 1 year ago - 5 downloads last month - 1 star on GitHub - 2 maintainers
dag-dq-generator 1.0.5
DPaaS Airflow DAG (Directed Acyclic Graph) and DQ (Data Quality) generator
5 versions - Latest release: 5 months ago - 21 downloads last month - 34,031 stars on GitHub - 2 maintainers
checkengine 0.2.0 removed
Data-quality checks for PySpark
1 version - Latest release: almost 3 years ago - 16 downloads last month - 30 stars on GitHub
Related Keywords
data-science (57), python (37), machine-learning (31), data-engineering (30), data-profiling (29), dataquality (21), mlops (18), deep-learning (18), dbt (17), data-quality-checks (16), data-cleaning (16), data-observability (15), data-centric-ai (15), data-validation (14), data (14), snowflake (14), data-curation (12), computer-vision (12), image-classification (11), active-learning (11), artificial-intelligence (11), exploratory-data-analysis (11), eda (10), data-exploration (10), data-unit-tests (10), visualization (10), pandas (9), object-detection (9), developer-tools (9), data-analysis (9), data-governance (9), dataengineering (9), data-lineage (9), validation (8), vector-search (8), data-testing (8), feature-store (8), unstructured-data (8), hacktoberfest (7), data-version-control (7), dataunittest (7), data-profilers (7), quality (7), spark (7), analytics (7), bigquery (7), data-reliability (7), ml (6), metadata-management (6), metadata (6), datadiscovery (6), data-pipeline (6), dataops (6), datacatalog (6), data-discovery (6), feature-engineering (6), data-contracts (6), data-visualization (6), dbt-metrics (6), data-catalog (6), statistics (6), pyspark (5), sql (5), data-monitoring (5), big-data (5), noisy-labels (5), pipeline (5), data-quality-monitoring (5), data-versioning (4), pydeequ (4), apache-spark (4), neural-networks (4), ai (4), database (4), dataframes (4), pytorch (4), redshift (4), classification (4), data-labeling (4), testing (4), outlier-detection (4), exploration (4), mysql (4), postgres (4), dbt-packages (4), etl (3), vector-database (3), dask (3), datacleaner (3), cleandata (3), explainable-ai (3), datavalidation (3), pandas-summary (3), oracle-database (3), pipelines (3), datacleaning (3), exploratory-analysis (3), tensorflow (3), tracking (3), embeddings-similarity (3)