pypi.org "dataengineering" keyword
Top 8.8% on pypi.org
zingg 0.5.0
Zingg Entity Resolution, Data Mastering and Deduplication
4 versions - Latest release: 10 months ago - 1 dependent repository - 3.94 thousand downloads last month - 1,093 stars on GitHub - 1 maintainer
aws-orbit-redshift 1.4.0
Orbit Workbench Redshift Plugin.
19 versions - Latest release: over 4 years ago - 1 dependent repository - 159 downloads last month - 147 stars on GitHub - 3 maintainers
athenasql 0.1.1
SQL builder for AWS Athena, inspired by sparkSQL
14 versions - Latest release: over 1 year ago - 77 downloads last month - 6 stars on GitHub - 1 maintainer
etlworkers 0.0.6
A Data Engineering package
6 versions - Latest release: over 4 years ago - 1 dependent repository - 37 downloads last month - 2 stars on GitHub - 1 maintainer
kedro-static-viz 0.4.4 💰
Creates a static visualization of your pipeline
12 versions - Latest release: about 5 years ago - 1 dependent repository - 79 downloads last month - 28 stars on GitHub - 1 maintainer
aws-orbit-ray 1.4.0
Launch a Pod for the team space that executes a script given by the user
13 versions - Latest release: over 4 years ago - 1 dependent repository - 155 downloads last month - 147 stars on GitHub - 3 maintainers
aws-orbit-sdk 1.4.0
AWS Orbit Workbench SDK
19 versions - Latest release: over 4 years ago - 1 dependent repository - 483 downloads last month - 147 stars on GitHub - 3 maintainers
grai_source_flat_file 0.2.2
18 versions - Latest release: over 2 years ago - 91 downloads last month - 310 stars on GitHub - 2 maintainers
rawbuilder 0.0.7
an elegant datasets factory
7 versions - Latest release: about 4 years ago - 1 dependent repository - 34 downloads last month - 6 stars on GitHub - 1 maintainer
aws-orbit-code-commit 1.4.0
Orbit Workbench CodeCommit Plugin.
19 versions - Latest release: over 4 years ago - 1 dependent repository - 320 downloads last month - 147 stars on GitHub - 3 maintainers
aws-orbit-lustre 1.4.0
Add support for FSX Lustre for high-performance file system
13 versions - Latest release: over 4 years ago - 1 dependent repository - 115 downloads last month - 147 stars on GitHub - 3 maintainers
grai_source_redshift 0.1.1
18 versions - Latest release: over 2 years ago - 62 downloads last month - 313 stars on GitHub - 2 maintainers
grai_source_cube 0.0.2
4 versions - Latest release: about 2 years ago - 21 downloads last month - 313 stars on GitHub - 1 maintainer
Top 6.1% on pypi.org
grai-client 0.3.5
61 versions - Latest release: over 2 years ago - 16 dependent packages - 3 dependent repositories - 219 downloads last month - 313 stars on GitHub - 2 maintainers
the-guide 0.1.36
1 version - Latest release: over 2 years ago - 11 downloads last month - 310 stars on GitHub - 2 maintainers
grai_source_looker 0.0.3
11 versions - Latest release: over 2 years ago - 48 downloads last month - 313 stars on GitHub - 2 maintainers
dftools-snowflake 0.1.1
Data Flooder Tools - Snowflake Package
7 versions - Latest release: over 2 years ago - 46 downloads last month - 1 maintainer
aws-orbit-hello-world 1.4.0
Minimal Orbit Workbench Plugin.
19 versions - Latest release: over 4 years ago - 1 dependent repository - 192 downloads last month - 147 stars on GitHub - 3 maintainers
normandy 0.2.3
A data pipeline framework.
5 versions - Latest release: over 4 years ago - 1 dependent repository - 49 downloads last month - 0 stars on GitHub - 1 maintainer
aws-orbit-team-script-launcher 1.4.0
Launch a Pod for the team space that executes a script given by the user
19 versions - Latest release: over 4 years ago - 1 dependent repository - 106 downloads last month - 147 stars on GitHub - 3 maintainers
Top 9.9% on pypi.org
grai-graph 0.2.5
24 versions - Latest release: over 2 years ago - 2 dependent packages - 1 dependent repository - 133 downloads last month - 313 stars on GitHub - 2 maintainers
pandas-aws 0.1.6
AWS helpers for data engineers and data scientists. Easily interacts with AWS from and to pandas....
7 versions - Latest release: about 5 years ago - 1 dependent repository - 38 downloads last month - 1 star on GitHub - 1 maintainer
lakeops 1.3.0
Data lake operations toolkit
16 versions - Latest release: about 1 month ago - 1.18 thousand downloads last month - 1 star on GitHub - 1 maintainer
projectoneflow-framework 1.0.0
This Project is wrapper for projectoneflow ingestion patterns
1 version - Latest release: 5 months ago - 8 downloads last month - 2 stars on GitHub - 1 maintainer
sqlmesh-utils 0.1.1
Utilities for SQLMesh
5 versions - Latest release: 7 months ago - 1.3 thousand downloads last month - 2,304 stars on GitHub - 1 maintainer
aws-orbit-custom-cfn 1.4.0
Launch a CloudFormation stack for the team space
19 versions - Latest release: over 4 years ago - 1 dependent repository - 90 downloads last month - 147 stars on GitHub - 3 maintainers
projectoneflow 1.0.0
This Project is data engineering framework which implements all data engineering ingestion patterns
1 version - Latest release: 5 months ago - 9 downloads last month - 1 star on GitHub - 1 maintainer
collate-data-diff 0.11.10
Command-line tool and Python library to efficiently diff rows across two different databases.
8 versions - Latest release: 3 months ago - 768 thousand downloads last month - 2,988 stars on GitHub - 3 maintainers
sparkdataset 1.0.0
Provides instant access to many popular datasets right from Pyspark (in dataframe structure).
1 version - Latest release: over 4 years ago - 1 dependent repository - 10 downloads last month - 34 stars on GitHub - 1 maintainer
grai_source_mysql 0.1.1
19 versions - Latest release: over 2 years ago - 68 downloads last month - 313 stars on GitHub - 2 maintainers
grai-source-openlineage 0.1.0a1
1 version - Latest release: over 2 years ago - 23 downloads last month - 313 stars on GitHub - 2 maintainers
dq-tester 0.4.0
A lightweight data quality testing tool for CSV files and databases
3 versions - Latest release: 7 months ago - 13 downloads last month - 6 stars on GitHub - 1 maintainer
Top 4.5% on pypi.org
data-diff 0.11.2
Command-line tool and Python library to efficiently diff rows across two different databases.
75 versions - Latest release: almost 2 years ago - 1 dependent package - 2 dependent repositories - 40.6 thousand downloads last month - 2,989 stars on GitHub - 4 maintainers
data-diff-customize 1.0.3
Command-line tool and Python library to efficiently diff rows across two different databases.
5 versions - Latest release: over 2 years ago - 62 downloads last month - 2,946 stars on GitHub - 1 maintainer
sqlmesh-cube 0.1.1.dev3
SQLMesh extension for generating Cube semantic layer configurations
5 versions - Latest release: over 1 year ago - 31 downloads last month - 2,695 stars on GitHub - 1 maintainer
Top 8.8% on pypi.org
openmetadata-airflow-managed-apis 0.10.1
Airflow REST APIs to create and manage DAGS
31 versions - Latest release: almost 4 years ago - 1 dependent repository - 175 downloads last month - 3,365 stars on GitHub - 1 maintainer
aws-orbit-overprovisioning 1.4.0
Launch a Pod for the team space that executes a script given by the user
13 versions - Latest release: over 4 years ago - 1 dependent repository - 116 downloads last month - 147 stars on GitHub - 3 maintainers
idg-metadata-client 1.0.2.0 💰
Ingestion Framework for OpenMetadata
1 version - Latest release: almost 3 years ago - 5 downloads last month - 7,743 stars on GitHub - 1 maintainer
prodmodel 0.4.3
Build data science pipelines and models
25 versions - Latest release: almost 7 years ago - 1 dependent repository - 175 downloads last month - 58 stars on GitHub - 1 maintainer
datu-core 0.0.3
LLM-Driven Data Transformations
4 versions - Latest release: 7 months ago - 89 downloads last month - 41 stars on GitHub - 2 maintainers
Top 7.3% on pypi.org
sqlmesh 0.232.0
Next-generation data transformation framework
2,297 versions - Latest release: 24 days ago - 1 dependent package - 1 dependent repository - 419 thousand downloads last month - 2,802 stars on GitHub - 7 maintainers
lingualeo-sqlmesh 0.0.2.dev1
Scalable and efficient data transformation framework - backwards compatible with dbt.
1 version - Latest release: over 1 year ago - 21 downloads last month - 2,720 stars on GitHub - 1 maintainer
leeroopedia-mcp 0.1.6
MCP server for Leeroopedia ML/AI knowledge search
7 versions - Latest release: about 2 months ago - 370 downloads last month - 2 stars on GitHub - 1 maintainer
grai_schemas 0.2.11
61 versions - Latest release: over 2 years ago - 189 downloads last month - 310 stars on GitHub - 2 maintainers
grai_source_dbt_cloud 0.1.5
18 versions - Latest release: about 2 years ago - 176 downloads last month - 310 stars on GitHub - 2 maintainers
camelcasing 0.1.3
Converts PascalCase or snake_case strings to camelCase.
13 versions - Latest release: about 3 years ago - 1.05 thousand downloads last month - 2 stars on GitHub - 1 maintainer
grai-cli 0.2.7
20 versions - Latest release: over 1 year ago - 70 downloads last month - 310 stars on GitHub - 2 maintainers
openmetadata-sqlalchemy-bigquery 1.2.0
SQLAlchemy dialect for BigQuery by OpenMetadata
4 versions - Latest release: over 4 years ago - 1 dependent package - 1 dependent repository - 44 downloads last month - 4,168 stars on GitHub - 1 maintainer
grai_source_dbt 0.3.5
42 versions - Latest release: about 2 years ago - 168 downloads last month - 313 stars on GitHub - 2 maintainers
grai_source_snowflake 0.1.2
29 versions - Latest release: over 2 years ago - 91 downloads last month - 310 stars on GitHub - 2 maintainers
pytzen 1.1.6 💰
DATA LAB
93 versions - Latest release: over 1 year ago - 1 dependent repository - 110 downloads last month - 1 star on GitHub - 1 maintainer
grai_source_fivetran 0.1.2
18 versions - Latest release: over 2 years ago - 75 downloads last month - 313 stars on GitHub - 2 maintainers
Top 9.4% on pypi.org
openmetadata-managed-apis 1.12.3.0
Airflow REST APIs to create and manage DAGS
385 versions - Latest release: about 1 month ago - 39.1 thousand downloads last month - 5,446 stars on GitHub - 1 maintainer
Top 3.9% on pypi.org
openmetadata-ingestion 0.10.1
Ingestion Framework for OpenMetadata
535 versions - Latest release: almost 4 years ago - 3 dependent packages - 2 dependent repositories - 491 thousand downloads last month - 5,446 stars on GitHub - 1 maintainer
Top 9.7% on pypi.org
openmetadata-ingestion-core 0.10.0 💰
These are the generated Python classes from JSON Schema
12 versions - Latest release: almost 4 years ago - 1 dependent package - 1 dependent repository - 504 downloads last month - 7,743 stars on GitHub - 1 maintainer
sysxtract 1.0.0
Extract logs based off events from sysmon. Comes as a package, cli and ui.
1 version - Latest release: almost 6 years ago - 1 dependent repository - 13 downloads last month - 3 stars on GitHub - 1 maintainer
pdfstract 1.1.1
PDFStract - The Extraction and Chunking Layer in Your RAG Pipeline - Available as CLI - WEBUI - API
7 versions - Latest release: about 1 month ago - 170 downloads last month - 116 stars on GitHub - 1 maintainer
aws-orbit-voila 1.4.0
Launch a Pod for the team space that executes a script given by the user
11 versions - Latest release: over 4 years ago - 1 dependent repository - 305 downloads last month - 147 stars on GitHub - 3 maintainers
cz-data-diff 0.0.4
Command-line tool and Python library to efficiently diff rows across two different databases.
3 versions - Latest release: over 2 years ago - 25 downloads last month - 2,988 stars on GitHub - 2 maintainers
Top 7.3% on pypi.org
aws-ddk 0.6.2
AWS DataOps Development Kit - CLI
15 versions - Latest release: about 3 years ago - 3 dependent repositories - 120 downloads last month - 270 stars on GitHub - 2 maintainers
aws-orbit 1.4.0
Data & ML Unified Development and Production Environment.
19 versions - Latest release: over 4 years ago - 1 dependent repository - 186 downloads last month - 147 stars on GitHub - 3 maintainers
grai_source_mssql 0.1.3
21 versions - Latest release: about 2 years ago - 96 downloads last month - 310 stars on GitHub - 2 maintainers
Top 9.6% on pypi.org
grai-source-postgres 0.2.4
30 versions - Latest release: over 2 years ago - 1 dependent package - 1 dependent repository - 128 downloads last month - 310 stars on GitHub - 2 maintainers
Top 7.0% on pypi.org
aws-ddk-core 1.4.1
The AWS DataOps Development Kit is an open source development framework for customers that build ...
23 versions - Latest release: about 1 year ago - 3 dependent repositories - 2.15 thousand downloads last month - 270 stars on GitHub - 3 maintainers
aws-orbit-sm-operator 1.4.0
Launch a Pod for the team space that executes a script given by the user
11 versions - Latest release: over 4 years ago - 1 dependent repository - 338 downloads last month - 147 stars on GitHub - 3 maintainers
clarifai-datautils 0.0.7
Clarifai Data Utils
7 versions - Latest release: over 1 year ago - 79 downloads last month - 7 stars on GitHub - 1 maintainer
zenopy 2022.10.29
zenopy: A Python wrapper package for Zenodo API
3 versions - Latest release: over 3 years ago - 1 dependent repository - 12 downloads last month - 7 stars on GitHub - 1 maintainer
dftools-core 0.1.1
Data Flooder Tools - Core Package
18 versions - Latest release: over 2 years ago - 1 dependent package - 125 downloads last month - 1 maintainer
metadata-guardian 0.3.0
MetadataGuardian is used to protect data by searching the source metadata.
13 versions - Latest release: over 1 year ago - 1 dependent repository - 35 downloads last month - 18 stars on GitHub - 1 maintainer
recohut 0.0.11
A python library for building recommender systems.
4 versions - Latest release: over 4 years ago - 1 dependent repository - 86 downloads last month - 14 stars on GitHub - 1 maintainer
livyc 0.0.14
Apache Livy Client
11 versions - Latest release: almost 4 years ago - 30 downloads last month - 3 stars on GitHub - 1 maintainer
aws-orbit-jupyterlab-orbit 1.4.0
AWS Orbit Workbench JupyterLab extension.
11 versions - Latest release: over 4 years ago - 1 dependent repository - 77 downloads last month - 147 stars on GitHub - 3 maintainers
grai_source_bigquery 0.2.4
29 versions - Latest release: over 2 years ago - 111 downloads last month - 313 stars on GitHub - 2 maintainers
aws-orbit-emr-on-eks 1.4.0
Allow users to run EMR jobs on their EKS namespace
18 versions - Latest release: over 4 years ago - 1 dependent repository - 256 downloads last month - 147 stars on GitHub - 3 maintainers
Related Keywords
python (36)
data (35)
redshift (32)
snowflake (30)
dbt (30)
data-science (29)
data-lineage (24)
postgresql (22)
hacktoberfest (22)
mysql (22)
parquet (18)
open-source (18)
mssql (18)
datalineage (18)
django (18)
fivetran (18)
aws (17)
datalake (16)
analytics (16)
data-analysis (15)
workbench (14)
orbit-workbench (14)
mach (14)
kubernetes (14)
jupyterhub (14)
jupyter (14)
gpu (14)
eks-cluster (14)
eks (14)
cdk (12)
dataquality (12)
data-quality (10)
sql (9)
datascience (8)
metadata-management (7)
metadata (7)
database (6)
datadiscovery (6)
data-validation (6)
dataops (6)
data-quality-checks (6)
data-catalog (6)
data-profiling (6)
data-observability (6)
data-governance (6)
data-discovery (6)
data-contracts (6)
trino (5)
data-collaboration (5)
data-engineering (5)
etl (5)
data-quality-monitoring (4)
data-diffing (4)
transformation (4)
databricks-sql (4)
oracle-database (4)
postgres (4)
rdbms (4)
elt (4)
spark (4)
pyspark (4)
mcp (3)
Python (2)
delta-lake (2)
projectoneflow (2)
unstructured-data (2)
dataset (2)
bigquery (2)
datacatalog (2)
mcp-server (2)
ai (2)
machinelearning (2)
llm (2)
dftools (2)
dataflooder (2)
df (2)
datapipeline (2)
datasets (2)
pyhton (1)
livy-docker (1)
livy-client (1)
docker (1)
big-data (1)
apache-spark (1)
apache-livy (1)
training (1)
bootcamp (1)
deeplearning (1)
recsys (1)
pii-detection (1)
metadata-parser (1)
metadata-information (1)
metadata-extraction (1)
bigdata (1)
Jupyter (1)
analysis (1)
infosec (1)
security (1)
softwareengineering (1)
pypi (1)