pypi.org "PySpark" keyword
Top 5.4% on pypi.org
td-pyspark 24.4.1
Treasure Data extension for pyspark
23 versions - Latest release: almost 2 years ago - 1 dependent package - 3 dependent repositories - 18.4 thousand downloads last month - 2 maintainers
join-skew-data-p3 0.7
customize partitions with skew data for join
2 versions - Latest release: about 5 years ago - 1 dependent repository - 24 downloads last month - 3 stars on GitHub - 1 maintainer
pyspark-delta-scd2 0.4.1 💰
This project utilizes faker-pyspark to generate random schema and dataframes to mimic data table ...
6 versions - Latest release: over 2 years ago - 51 downloads last month - 0 stars on GitHub - 1 maintainer
faker-pyspark 0.8.0
faker-pyspark is a PySpark DataFrame and Schema provider for the Faker python package
7 versions - Latest release: over 2 years ago - 1 dependent package - 115 downloads last month - 1 star on GitHub - 1 maintainer
spark-pit 0.5.0
PIT join library for PySpark
8 versions - Latest release: almost 4 years ago - 1 dependent repository - 54 downloads last month - 30 stars on GitHub - 1 maintainer
karadoc 0.0.3
Karadoc is a data engineering Python framework built on top of PySpark that simplifies ETL/ELT
1 version - Latest release: over 3 years ago - 13 downloads last month - 0 stars on GitHub - 1 maintainer
annodize 0.2.0
"Python Annotations that are shockingly useful!"
1 version - Latest release: almost 4 years ago - 1 dependent repository - 21 downloads last month - 0 stars on GitHub - 1 maintainer
spark-frame 0.5.2
A library containing various utility functions for playing with PySpark DataFrames
13 versions - Latest release: over 1 year ago - 1.81 thousand downloads last month - 11 stars on GitHub - 1 maintainer
smartframes 1.1.0
Enhanced Python Dataframes for Spark/PySpark
3 versions - Latest release: over 10 years ago - 2 dependent repositories - 8 downloads last month - 6 stars on GitHub - 1 maintainer
td-pyspark-ea 20.12.0
Treasure Data extension for pyspark
4 versions - Latest release: over 5 years ago - 1 dependent repository - 104 downloads last month - 2 maintainers
spark-lean 0.3.3
An interactive PySpark-based Data Cleaning Library
4 versions - Latest release: almost 8 years ago - 1 dependent repository - 23 downloads last month - 7 stars on GitHub - 2 maintainers
rstudio-spark-install 0.8.0
Utility to set up various versions of Apache Spark on multiple platforms.
1 version - Latest release: over 8 years ago - 1 dependent repository - 7 downloads last month - 16 stars on GitHub - 1 maintainer
join-skew-data 0.5
customize partitions with skew data for join
4 versions - Latest release: about 7 years ago - 1 dependent repository - 453 downloads last month - 3 stars on GitHub - 1 maintainer
azure-devops-pyspark 1.0.5
A productive library to extract data from Azure Devops and apply agile metrics.
32 versions - Latest release: over 3 years ago - 96 downloads last month - 1 star on GitHub - 1 maintainer
vee-ap-generation-pyspark-app 1.0.1 removed
PySpark App on PIE Spark
1 version - Latest release: about 3 years ago - 1 maintainer