pypi.org "dataengineering" keyword
pandas-aws 0.1.6
AWS helpers for data engineers and data scientists. Easily interacts with AWS from and to pandas....
7 versions - Latest release: almost 5 years ago - 1 dependent repositories - 38 downloads last month - 1 stars on GitHub - 1 maintainer
lakeops 1.3.0
Data lake operations toolkit
16 versions - Latest release: 3 days ago - 1.18 thousand downloads last month - 1 stars on GitHub - 1 maintainer
sqlmesh-utils 0.1.1
Utilities for SQLMesh
5 versions - Latest release: 6 months ago - 596 downloads last month - 2,304 stars on GitHub - 1 maintainer
aws-orbit-custom-cfn 1.4.0
Launch a CloudFormation stack for the team space
19 versions - Latest release: over 4 years ago - 1 dependent repositories - 340 downloads last month - 147 stars on GitHub - 3 maintainers
projectoneflow 1.0.0
This Project is data engineering framework which implements all data engineering ingestion patterns
1 version - Latest release: 3 months ago - 7 downloads last month - 1 stars on GitHub - 1 maintainer
projectoneflow-framework 1.0.0
This Project is wrapper for projectoneflow ingestion patterns
1 version - Latest release: 3 months ago - 6 downloads last month - 1 stars on GitHub - 1 maintainer
collate-data-diff 0.11.10
Command-line tool and Python library to efficiently diff rows across two different databases.
8 versions - Latest release: about 1 month ago - 613 thousand downloads last month - 2,988 stars on GitHub - 3 maintainers
sparkdataset 1.0.0
Provides instant access to many popular datasets right from Pyspark (in dataframe structure).
1 version - Latest release: over 4 years ago - 1 dependent repositories - 20 downloads last month - 34 stars on GitHub - 1 maintainer
grai_source_mysql 0.1.1
19 versions - Latest release: over 2 years ago - 109 downloads last month - 310 stars on GitHub - 2 maintainers
grai-source-openlineage 0.1.0a1
1 version - Latest release: over 2 years ago - 11 downloads last month - 312 stars on GitHub - 2 maintainers
normandy 0.2.3
A data pipeline framework.
5 versions - Latest release: over 4 years ago - 1 dependent repositories - 19 downloads last month - 0 stars on GitHub - 1 maintainer
dq-tester 0.4.0
A lightweight data quality testing tool for CSV files and databases
3 versions - Latest release: 5 months ago - 24 downloads last month - 6 stars on GitHub - 1 maintainer
Top 4.5% on pypi.org
data-diff 0.11.2
Command-line tool and Python library to efficiently diff rows across two different databases.
75 versions - Latest release: almost 2 years ago - 1 dependent package - 2 dependent repositories - 40.6 thousand downloads last month - 2,989 stars on GitHub - 4 maintainers
prodmodel 0.4.3
Build data science pipelines and models
25 versions - Latest release: over 6 years ago - 1 dependent repositories - 239 downloads last month - 58 stars on GitHub - 1 maintainer
sqlmesh-cube 0.1.1.dev3
SQLMesh extension for generating Cube semantic layer configurations
5 versions - Latest release: about 1 year ago - 77 downloads last month - 2,695 stars on GitHub - 1 maintainer
Top 7.3% on pypi.org
sqlmesh 0.230.1
Next-generation data transformation framework
2,285 versions - Latest release: 20 days ago - 1 dependent package - 1 dependent repositories - 419 thousand downloads last month - 2,802 stars on GitHub - 7 maintainers
lingualeo-sqlmesh 0.0.2.dev1
Scalable and efficient data transformation framework - backwards compatible with dbt.
1 version - Latest release: over 1 year ago - 21 downloads last month - 2,720 stars on GitHub - 1 maintainer
datu-core 0.0.3
LLM-Driven Data Transformations
4 versions - Latest release: 6 months ago - 89 downloads last month - 41 stars on GitHub - 2 maintainers
grai-cli 0.2.7
20 versions - Latest release: over 1 year ago - 70 downloads last month - 310 stars on GitHub - 2 maintainers
idg-metadata-client 1.0.2.0 💰
Ingestion Framework for OpenMetadata
1 version - Latest release: over 2 years ago - 12 downloads last month - 7,743 stars on GitHub - 1 maintainer
Top 9.7% on pypi.org
openmetadata-ingestion-core 0.10.0 💰
These are the generated Python classes from JSON Schema
12 versions - Latest release: almost 4 years ago - 1 dependent package - 1 dependent repositories - 378 downloads last month - 7,743 stars on GitHub - 1 maintainer
Top 3.9% on pypi.org
openmetadata-ingestion 0.10.1
Ingestion Framework for OpenMetadata
522 versions - Latest release: almost 4 years ago - 3 dependent packages - 2 dependent repositories - 308 thousand downloads last month - 5,446 stars on GitHub - 1 maintainer
Top 9.4% on pypi.org
openmetadata-managed-apis 1.12.0.0
Airflow REST APIs to create and manage DAGS
372 versions - Latest release: 16 days ago - 36.2 thousand downloads last month - 5,446 stars on GitHub - 1 maintainer
openmetadata-sqlalchemy-bigquery 1.2.0
SQLAlchemy dialect for BigQuery by OpenMetadata
4 versions - Latest release: over 4 years ago - 1 dependent package - 1 dependent repositories - 44 downloads last month - 4,168 stars on GitHub - 1 maintainer
Top 8.8% on pypi.org
openmetadata-airflow-managed-apis 0.10.1
Airflow REST APIs to create and manage DAGS
31 versions - Latest release: almost 4 years ago - 1 dependent repositories - 1.03 thousand downloads last month - 3,365 stars on GitHub - 1 maintainer
grai_schemas 0.2.11
61 versions - Latest release: over 2 years ago - 212 downloads last month - 310 stars on GitHub - 2 maintainers
grai_source_snowflake 0.1.2
29 versions - Latest release: over 2 years ago - 91 downloads last month - 310 stars on GitHub - 2 maintainers
data-diff-customize 1.0.3
Command-line tool and Python library to efficiently diff rows across two different databases.
5 versions - Latest release: over 2 years ago - 62 downloads last month - 2,946 stars on GitHub - 1 maintainer
grai_source_fivetran 0.1.2
18 versions - Latest release: over 2 years ago - 155 downloads last month - 313 stars on GitHub - 2 maintainers
pytzen 1.1.6 💰
DATA LAB
93 versions - Latest release: over 1 year ago - 1 dependent repositories - 110 downloads last month - 1 stars on GitHub - 1 maintainer
camelcasing 0.1.3
Converts PascalCase or snake_case strings to camelCase.
13 versions - Latest release: about 3 years ago - 1.34 thousand downloads last month - 2 stars on GitHub - 1 maintainer
grai_source_dbt 0.3.5
42 versions - Latest release: about 2 years ago - 155 downloads last month - 310 stars on GitHub - 2 maintainers
aws-orbit-voila 1.4.0
Launch a Pod for the team space that executes a script given by the user
11 versions - Latest release: over 4 years ago - 1 dependent repositories - 305 downloads last month - 147 stars on GitHub - 3 maintainers
aws-orbit-overprovisioning 1.4.0
Launch a Pod for the team space that executes a script given by the user
13 versions - Latest release: over 4 years ago - 1 dependent repositories - 427 downloads last month - 147 stars on GitHub - 3 maintainers
Top 7.3% on pypi.org
aws-ddk 0.6.2
AWS DataOps Development Kit - CLI
15 versions - Latest release: about 3 years ago - 3 dependent repositories - 119 downloads last month - 270 stars on GitHub - 2 maintainers
grai_source_mssql 0.1.3
21 versions - Latest release: about 2 years ago - 130 downloads last month - 310 stars on GitHub - 2 maintainers
Top 9.6% on pypi.org
grai-source-postgres 0.2.4
30 versions - Latest release: over 2 years ago - 1 dependent package - 1 dependent repositories - 135 downloads last month - 310 stars on GitHub - 2 maintainers
Top 8.8% on pypi.org
zingg 0.5.0
Zingg Entity Resolution, Data Mastering and Deduplication
4 versions - Latest release: 8 months ago - 1 dependent repositories - 2.38 thousand downloads last month - 1,093 stars on GitHub - 1 maintainer
grai_source_dbt_cloud 0.1.5
18 versions - Latest release: about 2 years ago - 176 downloads last month - 310 stars on GitHub - 2 maintainers
cz-data-diff 0.0.4
Command-line tool and Python library to efficiently diff rows across two different databases.
3 versions - Latest release: about 2 years ago - 45 downloads last month - 2,988 stars on GitHub - 2 maintainers
Top 7.0% on pypi.org
aws-ddk-core 1.4.1
The AWS DataOps Development Kit is an open source development framework for customers that build ...
23 versions - Latest release: 12 months ago - 3 dependent repositories - 797 downloads last month - 270 stars on GitHub - 3 maintainers
aws-orbit-sm-operator 1.4.0
Launch a Pod for the team space that executes a script given by the user
11 versions - Latest release: over 4 years ago - 1 dependent repositories - 338 downloads last month - 147 stars on GitHub - 3 maintainers
clarifai-datautils 0.0.7
Clarifai Data Utils
7 versions - Latest release: about 1 year ago - 79 downloads last month - 7 stars on GitHub - 1 maintainer
aws-orbit 1.4.0
Data & ML Unified Development and Production Environment.
19 versions - Latest release: over 4 years ago - 1 dependent repositories - 88 downloads last month - 147 stars on GitHub - 3 maintainers
sysxtract 1.0.0
Extract logs based off events from sysmon. Comes as a package, cli and ui.
1 version - Latest release: almost 6 years ago - 1 dependent repositories - 11 downloads last month - 3 stars on GitHub - 1 maintainer
dftools-core 0.1.1
Data Flooder Tools - Core Package
18 versions - Latest release: about 2 years ago - 1 dependent package - 125 downloads last month - 1 maintainer
aws-orbit-ray 1.4.0
Launch a Pod for the team space that executes a script given by the user
13 versions - Latest release: over 4 years ago - 1 dependent repositories - 96 downloads last month - 147 stars on GitHub - 3 maintainers
aws-orbit-lustre 1.4.0
Add support for FSX Lustre for high-performance file system
13 versions - Latest release: over 4 years ago - 1 dependent repositories - 115 downloads last month - 147 stars on GitHub - 3 maintainers
recohut 0.0.11
A python library for building recommender systems.
4 versions - Latest release: about 4 years ago - 1 dependent repositories - 40 downloads last month - 14 stars on GitHub - 1 maintainer
aws-orbit-sdk 1.4.0
AWS Orbit Workbench SDK
19 versions - Latest release: over 4 years ago - 1 dependent repositories - 483 downloads last month - 147 stars on GitHub - 3 maintainers
etlworkers 0.0.6
A Data Engineering package
6 versions - Latest release: over 4 years ago - 1 dependent repositories - 79 downloads last month - 2 stars on GitHub - 1 maintainer
aws-orbit-redshift 1.4.0
Orbit Workbench Redshift Plugin.
19 versions - Latest release: over 4 years ago - 1 dependent repositories - 145 downloads last month - 147 stars on GitHub - 3 maintainers
zenopy 2022.10.29
zenopy: A Python wrapper package for Zenodo API
3 versions - Latest release: over 3 years ago - 1 dependent repositories - 12 downloads last month - 7 stars on GitHub - 1 maintainer
metadata-guardian 0.3.0
MetadataGuardian is used to protect data by searching the source metadata.
13 versions - Latest release: about 1 year ago - 1 dependent repositories - 205 downloads last month - 18 stars on GitHub - 1 maintainer
livyc 0.0.14
Apache Livy Client
11 versions - Latest release: over 3 years ago - 62 downloads last month - 3 stars on GitHub - 1 maintainer
grai_source_redshift 0.1.1
18 versions - Latest release: over 2 years ago - 121 downloads last month - 313 stars on GitHub - 2 maintainers
Top 6.1% on pypi.org
grai-client 0.3.5
61 versions - Latest release: over 2 years ago - 16 dependent packages - 3 dependent repositories - 378 downloads last month - 313 stars on GitHub - 2 maintainers
aws-orbit-code-commit 1.4.0
Orbit Workbench CodeCommit Plugin.
19 versions - Latest release: over 4 years ago - 1 dependent repositories - 430 downloads last month - 147 stars on GitHub - 3 maintainers
grai_source_cube 0.0.2
4 versions - Latest release: almost 2 years ago - 60 downloads last month - 313 stars on GitHub - 1 maintainer
aws-orbit-jupyterlab-orbit 1.4.0
AWS Orbit Workbench JupyterLab extension.
11 versions - Latest release: over 4 years ago - 1 dependent repositories - 77 downloads last month - 147 stars on GitHub - 3 maintainers
grai_source_bigquery 0.2.4
29 versions - Latest release: over 2 years ago - 239 downloads last month - 313 stars on GitHub - 2 maintainers
the-guide 0.1.36
1 version - Latest release: over 2 years ago - 13 downloads last month - 310 stars on GitHub - 2 maintainers
grai_source_looker 0.0.3
11 versions - Latest release: over 2 years ago - 76 downloads last month - 310 stars on GitHub - 2 maintainers
grai_source_flat_file 0.2.2
18 versions - Latest release: over 2 years ago - 151 downloads last month - 310 stars on GitHub - 2 maintainers
rawbuilder 0.0.7
an elegant datasets factory
7 versions - Latest release: about 4 years ago - 1 dependent repositories - 52 downloads last month - 6 stars on GitHub - 1 maintainer
athenasql 0.1.1
SQL builder for AWS Athena, inspired by sparkSQL
14 versions - Latest release: over 1 year ago - 271 downloads last month - 6 stars on GitHub - 1 maintainer
aws-orbit-emr-on-eks 1.4.0
Allow users to run EMR jobs on their EKS namespace
18 versions - Latest release: over 4 years ago - 1 dependent repositories - 320 downloads last month - 147 stars on GitHub - 3 maintainers
aws-orbit-team-script-launcher 1.4.0
Launch a Pod for the team space that executes a script given by the user
19 versions - Latest release: over 4 years ago - 1 dependent repositories - 283 downloads last month - 147 stars on GitHub - 3 maintainers
dftools-snowflake 0.1.1
Data Flooder Tools - Snowflake Package
7 versions - Latest release: about 2 years ago - 57 downloads last month - 1 maintainer
kedro-static-viz 0.4.4 💰
Creates a static visualization of your pipeline
12 versions - Latest release: about 5 years ago - 1 dependent repositories - 19 downloads last month - 28 stars on GitHub - 1 maintainer
Top 9.9% on pypi.org
grai-graph 0.2.5
24 versions - Latest release: over 2 years ago - 2 dependent packages - 1 dependent repositories - 123 downloads last month - 310 stars on GitHub - 2 maintainers
aws-orbit-hello-world 1.4.0
Minimal Orbit Workbench Plugin.
19 versions - Latest release: over 4 years ago - 1 dependent repositories - 292 downloads last month - 147 stars on GitHub - 3 maintainers
Related Keywords
python: 36
data: 35
redshift: 32
snowflake: 30
dbt: 30
data-science: 29
data-lineage: 24
mysql: 22
hacktoberfest: 22
postgresql: 22
parquet: 18
datalineage: 18
django: 18
fivetran: 18
mssql: 18
open-source: 18
aws: 17
datalake: 16
analytics: 16
data-analysis: 15
workbench: 14
orbit-workbench: 14
mach: 14
kubernetes: 14
jupyterhub: 14
jupyter: 14
gpu: 14
eks-cluster: 14
eks: 14
cdk: 12
dataquality: 12
data-quality: 10
sql: 9
datascience: 8
metadata-management: 7
metadata: 7
data-catalog: 6
data-contracts: 6
data-discovery: 6
datadiscovery: 6
data-validation: 6
data-quality-checks: 6
database: 6
dataops: 6
data-governance: 6
data-profiling: 6
data-observability: 6
trino: 5
data-engineering: 5
etl: 5
data-collaboration: 5
elt: 4
transformation: 4
spark: 4
pyspark: 4
data-diffing: 4
data-quality-monitoring: 4
rdbms: 4
postgres: 4
oracle-database: 4
databricks-sql: 4
mcp: 3
bigquery: 2
datacatalog: 2
datapipeline: 2
dftools: 2
dataflooder: 2
mcp-server: 2
machinelearning: 2
projectoneflow: 2
delta-lake: 2
Python: 2
datasets: 2
dataset: 2
df: 2
dataanalysis: 1
ingestion: 1
ingestion-pipeline: 1
threathunting: 1
threat-intelligence: 1
unstructured-data: 1
unstructured-data-analysis: 1
unstructured-image: 1
recsys: 1
unstructured-text: 1
security: 1
infosec: 1
analysis: 1
deeplearning: 1
bigdata: 1
molssi-best-practices: 1
streamlit: 1
sysmon: 1
molssi: 1
chemai: 1
zenopy: 1
training: 1
bootcamp: 1
kedro-plugin: 1
kedro: 1