pypi.org "data-profiling" keyword
View the packages on the pypi.org package registry that are tagged with the "data-profiling" keyword.
Top 0.5% on pypi.org
pandas-profiling 3.6.6
Deprecated 'pandas-profiling' package, use 'ydata-profiling' instead
40 versions - Latest release: almost 3 years ago - 46 dependent packages - 1,970 dependent repositories - 329 thousand downloads last month - 12,108 stars on GitHub - 4 maintainers
Top 6.0% on pypi.org
cleanvision 0.3.6
Find issues in image datasets
12 versions - Latest release: over 1 year ago - 3 dependent packages - 2 dependent repositories - 4.98 thousand downloads last month - 1,116 stars on GitHub - 6 maintainers
Top 1.0% on pypi.org
ydata-profiling 4.17.0
Generate profile report for pandas DataFrame
32 versions - Latest release: about 1 month ago - 43 dependent packages - 79 dependent repositories - 1.34 million downloads last month - 13,224 stars on GitHub - 1 maintainer
auratrace 0.1.0
AI-powered data lineage and observability tool for Python
1 version - Latest release: 4 months ago - 23 downloads last month - 1 star on GitHub - 1 maintainer
Top 0.7% on pypi.org
great-expectations 1.8.1
Always know what to expect from your data.
338 versions - Latest release: 3 days ago - 58 dependent packages - 284 dependent repositories - 23.4 million downloads last month - 9,420 stars on GitHub - 8 maintainers
example-package-elisno 2.6.24
The standard package for data-centric AI, machine learning with label errors, and automatically f...
7 versions - Latest release: over 1 year ago - 35 downloads last month - 10,867 stars on GitHub - 1 maintainer
zarque-profiling 0.5.10
Data profiling tools for Big Data
6 versions - Latest release: over 2 years ago - 288 downloads last month - 9 stars on GitHub - 1 maintainer
csv-mcp-server 1.0.0
MCP server for comprehensive CSV file operations with pandas-based tools
1 version - Latest release: 3 months ago - 54 downloads last month - 6 stars on GitHub - 1 maintainer
Top 3.9% on pypi.org
openmetadata-ingestion 0.10.1
Ingestion Framework for OpenMetadata
462 versions - Latest release: over 3 years ago - 3 dependent packages - 2 dependent repositories - 257 thousand downloads last month - 5,446 stars on GitHub - 1 maintainer
Top 9.4% on pypi.org
openmetadata-managed-apis 1.10.4.0
Airflow REST APIs to create and manage DAGS
312 versions - Latest release: 3 days ago - 28 thousand downloads last month - 5,446 stars on GitHub - 1 maintainer
idg-metadata-client 1.0.2.0 💰
Ingestion Framework for OpenMetadata
1 version - Latest release: over 2 years ago - 19 downloads last month - 7,743 stars on GitHub - 1 maintainer
Top 9.7% on pypi.org
openmetadata-ingestion-core 0.10.0 💰
These are the generated Python classes from JSON Schema
12 versions - Latest release: over 3 years ago - 1 dependent package - 1 dependent repository - 313 downloads last month - 7,743 stars on GitHub - 1 maintainer
Top 8.8% on pypi.org
openmetadata-airflow-managed-apis 0.10.1
Airflow REST APIs to create and manage DAGS
31 versions - Latest release: over 3 years ago - 1 dependent repository - 859 downloads last month - 3,365 stars on GitHub - 1 maintainer
openmetadata-sqlalchemy-bigquery 1.2.0
SQLAlchemy dialect for BigQuery by OpenMetadata
4 versions - Latest release: almost 4 years ago - 1 dependent package - 1 dependent repository - 13 downloads last month - 4,168 stars on GitHub - 1 maintainer
piperider-cli 0.1.3.12
PiperRider CLI
9 versions - Latest release: over 3 years ago - 1 dependent repository - 51 downloads last month - 489 stars on GitHub - 1 maintainer
Top 4.6% on pypi.org
popmon 1.4.11
Monitor the stability of a pandas or spark dataset
41 versions - Latest release: about 2 months ago - 1 dependent package - 6 dependent repositories - 3.36 thousand downloads last month - 505 stars on GitHub - 3 maintainers
databricks-labs-dqx 0.9.3
Data Quality eXtended (DQX) is a Python library for data quality checks and data quality monitoring
24 versions - Latest release: about 1 month ago - 1.92 million downloads last month - 329 stars on GitHub - 1 maintainer
Top 4.8% on pypi.org
optimuspyspark 2.2.32
Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion wi...
83 versions - Latest release: over 5 years ago - 8 dependent repositories - 2.41 thousand downloads last month - 1,517 stars on GitHub - 2 maintainers
acryl-great-expectations 0.15.50.1
Always know what to expect from your data.
1 version - Latest release: 6 months ago - 157 thousand downloads last month - 10,884 stars on GitHub - 2 maintainers
Top 6.3% on pypi.org
great-expectations-experimental 0.1.20240917055
Always know what to expect from your data.
530 versions - Latest release: about 1 year ago - 1 dependent package - 1 dependent repository - 497 thousand downloads last month - 9,420 stars on GitHub - 4 maintainers
great-expectations-cta 0.15.43
Always know what to expect from your data.
2 versions - Latest release: almost 3 years ago - 1 dependent package - 20 downloads last month - 9,420 stars on GitHub - 1 maintainer
pytics 1.1.5
An interactive data profiling library for Python notebooks with rich HTML reports and PDF export ...
7 versions - Latest release: 7 months ago - 23 downloads last month - 1 star on GitHub - 1 maintainer
Top 6.0% on pypi.org
piperider 1.0.2
PiperRider CLI
170 versions - Latest release: over 4 years ago - 5 dependent repositories - 1.81 thousand downloads last month - 489 stars on GitHub - 1 maintainer
haisweetviz 1.0.2
A pandas-based library to visualize and compare datasets.
2 versions - Latest release: over 5 years ago - 1 dependent repository - 16 downloads last month - 3,047 stars on GitHub - 1 maintainer
frameon 0.1.2
Frameon extends pandas DataFrame with analysis methods while keeping all original functionality i...
4 versions - Latest release: 3 months ago - 30 downloads last month - 2 stars on GitHub - 1 maintainer
raymon 0.0.39
Python package for data logging and monitoring.
14 versions - Latest release: about 4 years ago - 1 dependent repository - 78 downloads last month - 18 stars on GitHub - 1 maintainer
Top 9.8% on pypi.org
haupt 2.11.1
Lineage metadata API, artifacts streams, sandbox, ML-API, and spaces for Polyaxon.
170 versions - Latest release: about 2 months ago - 1 dependent package - 1.09 thousand downloads last month - 452 stars on GitHub - 2 maintainers
lineagemd 0.0.0
Lineage metadata for ML/AI/Data.
1 version - Latest release: about 3 years ago - 9 downloads last month - 452 stars on GitHub - 2 maintainers
hauptai 0.0.0
Haupt ai.
1 version - Latest release: about 3 years ago - 14 downloads last month - 452 stars on GitHub - 2 maintainers
megaprofiler 1.0.0
megaprofiler is a highly customizable and extensible data profiling library designed to help data...
5 versions - Latest release: 10 months ago - 22 downloads last month - 0 stars on GitHub - 1 maintainer
Top 4.7% on pypi.org
pydeequ-alb 0.0.1
PyDeequ - Unit Tests for Data
1 version - Latest release: about 3 years ago - 800 stars on GitHub
desbordante 2.4.1
Science-intensive high-performance data profiler
11 versions - Latest release: 7 days ago - 1 dependent package - 2.39 thousand downloads last month - 424 stars on GitHub - 1 maintainer
cleanlab-cli 0.1.14
Command line interface for all things Cleanlab Studio
16 versions - Latest release: about 3 years ago - 94 downloads last month - 21 stars on GitHub - 3 maintainers
odd-collector 0.1.18
ODD Collector
1 version - Latest release: over 2 years ago - 16 downloads last month - 44 stars on GitHub - 1 maintainer
Top 3.5% on pypi.org
datatile 1.0.3
A library for managing, summarizing, and visualizing data.
20 versions - Latest release: about 3 years ago - 1 dependent package - 14 dependent repositories - 99.5 thousand downloads last month - 518 stars on GitHub - 2 maintainers
piperider-nightly 0.42.0.20250102
PiperRider CLI
706 versions - Latest release: 10 months ago - 3.94 thousand downloads last month - 479 stars on GitHub - 1 maintainer
haiqv-profiling 0.0.1
Generate profile report for pandas DataFrame
1 version - Latest release: almost 5 years ago - 1 dependent repository - 18 downloads last month - 12,108 stars on GitHub - 1 maintainer
panda-helper 0.1.2
Data profiler for Pandas
7 versions - Latest release: 9 months ago - 1 dependent repository - 39 downloads last month - 3 stars on GitHub - 1 maintainer
cleanengine 0.1.2
The Ultimate Data Cleaning & Analysis Toolkit
2 versions - Latest release: 2 months ago - 1 maintainer
edexplore 1.0.1
A simple widget for interactive EDA / QA for those who use Pandas in Jupyter Notebook.
1 version - Latest release: over 1 year ago - 8 downloads last month - 0 stars on GitHub - 1 maintainer
Top 9.4% on pypi.org
pyoptimus 0.1.0
Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion.
32 versions - Latest release: about 3 years ago - 1 dependent repository - 275 downloads last month - 1,447 stars on GitHub - 2 maintainers
dqops 1.12.0
DQOps Data Quality Operations Center
36 versions - Latest release: 3 months ago - 283 downloads last month - 168 stars on GitHub - 1 maintainer
gate-drift 0.1.5
Data drift detection tool for machine learning pipelines.
5 versions - Latest release: over 2 years ago - 2 dependent repositories - 35 downloads last month - 21 stars on GitHub - 1 maintainer
Top 3.3% on pypi.org
traceml 1.14.2
Engine for ML/Data tracking, visualization, dashboards, and model UI for Polyaxon.
95 versions - Latest release: almost 4 years ago - 3 dependent packages - 8 dependent repositories - 110 thousand downloads last month - 519 stars on GitHub - 2 maintainers
autocsv-profiler 2.0.0
Automated CSV data analysis with statistical profiling and visualization
2 versions - Latest release: 24 days ago - 15 downloads last month - 0 stars on GitHub - 1 maintainer
databeak 0.1.2
DataBeak: MCP server for comprehensive CSV file operations with pandas-based tools
6 versions - Latest release: 26 days ago - 444 downloads last month - 0 stars on GitHub - 1 maintainer
splurge-data-profiler 2025.2.0
A data profiling tool for delimited and database sources.
4 versions - Latest release: about 2 months ago - 130 downloads last month - 1 star on GitHub - 1 maintainer
easiersdk 0.1.16
This library contains code for interacting with EASIER.AI platform.
102 versions - Latest release: over 4 years ago - 1 dependent repository - 1.18 thousand downloads last month - 3,009 stars on GitHub - 1 maintainer
Top 2.1% on pypi.org
sweetviz 2.3.1
A pandas-based library to visualize and compare datasets.
35 versions - Latest release: almost 2 years ago - 16 dependent packages - 167 dependent repositories - 103 thousand downloads last month - 3,039 stars on GitHub - 1 maintainer
pydeequalb 0.0.4
PyDeequ - Unit Tests for Data
3 versions - Latest release: about 3 years ago - 17 downloads last month - 768 stars on GitHub - 1 maintainer
Top 2.5% on pypi.org
pandas-summary 0.2.0
An extension to pandas describe function.
7 versions - Latest release: almost 4 years ago - 4 dependent packages - 137 dependent repositories - 102 thousand downloads last month - 520 stars on GitHub - 2 maintainers
Top 2.1% on pypi.org
cleanlab 2.7.1
The standard package for data-centric AI, machine learning with label errors, and automatically f...
33 versions - Latest release: 8 months ago - 11 dependent packages - 19 dependent repositories - 37.4 thousand downloads last month - 10,867 stars on GitHub - 4 maintainers
cleanlab-studio 2.5.21
Client interface for all things Cleanlab Studio
128 versions - Latest release: 9 months ago - 1 dependent repository - 2.73 thousand downloads last month - 21 stars on GitHub - 4 maintainers
metacrafter 0.0.4
Metacrafter metadata classification tool
4 versions - Latest release: over 1 year ago - 1 dependent repository - 12 downloads last month - 41 stars on GitHub - 1 maintainer
pyspark-analyzer 5.0.2
A comprehensive PySpark DataFrame profiler for generating detailed statistics and data quality re...
21 versions - Latest release: 4 months ago - 77 downloads last month - 1 star on GitHub - 1 maintainer
compars 0.0.0
DataFrame comparison done right (AKA the Bear-agnostic DataFrame comparison library)
1 version - Latest release: over 1 year ago - 13 downloads last month - 0 stars on GitHub - 1 maintainer
hyper-aidev 0.1.1
A Python library to simplify model learning, training, and creation for powerful AI models across...
2 versions - Latest release: 5 months ago - 15 downloads last month - 1 maintainer
data-ops-testgen
DataKitchen Inc. Data Quality Engine
3 versions - 146 downloads last month - 124 stars on GitHub - 1 maintainer
Top 1.9% on pypi.org
pydeequ 1.5.0
PyDeequ - Unit Tests for Data
14 versions - Latest release: 7 months ago - 8 dependent packages - 53 dependent repositories - 14.6 million downloads last month - 694 stars on GitHub - 2 maintainers
Related Keywords
data-science (41)
data-quality (40)
data-analysis (21)
python (21)
pandas (20)
machine-learning (20)
exploratory-data-analysis (19)
data-exploration (18)
eda (18)
dataquality (16)
data-validation (16)
data-visualization (15)
mlops (14)
data-observability (12)
statistics (12)
data-quality-checks (11)
deep-learning (10)
data-engineering (10)
jupyter (9)
data-cleaning (9)
spark (9)
hacktoberfest (9)
data-profilers (7)
data-unit-tests (7)
datacleaner (7)
snowflake (7)
data-lineage (7)
dataunittest (7)
plotly (7)
outlier-detection (7)
visualization (7)
data-catalog (7)
data-discovery (7)
data-governance (7)
metadata (7)
datadiscovery (7)
dbt (7)
matplotlib (6)
pytorch (6)
dataops (6)
tensorflow (6)
tracking (6)
data-contracts (6)
metadata-management (6)
dataengineering (6)
pandas-dataframe (6)
exploration (6)
polyaxon (5)
kubernetes (5)
pipeline-tests (5)
exploratory-analysis (5)
datacleaning (5)
ipython (5)
data-collaboration (5)
data-centric-ai (5)
csv (5)
datavalidation (5)
data (5)
dask (5)
ai (4)
reinforcement-learning (4)
data-reliability (4)
noisy-labels (4)
lineage (4)
data-profiler (4)
dataframes (4)
data-labeling (4)
data-curation (4)
data-processing (4)
mcp (4)
testing (4)
pipeline (4)
quality (4)
science (4)
validation (4)
cleandata (4)
exploratorydataanalysis (4)
pipeline-debt (4)
pipeline-testing (4)
anomaly-detection (4)
data-wrangling (4)
pyspark (4)
jupyter-notebook (4)
artificial-intelligence (4)
data-pipeline (3)
datacatalog (3)
deequ (3)
data-testing (3)
continuous-integration (3)
code-review (3)
html-report (3)
dbt-metrics (3)
pull-requests (3)
reporting (3)
neural-networks (3)
gcs (3)
google cloud storage (3)
azure (3)
microsoft (3)
s3 (3)