Ecosyste.ms: Packages
An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.
pypi.org "spark" keyword
impyla 0.19.0 - Top 1.4% on pypi.org
Python client for the Impala distributed query engine
52 versions - Latest release: 6 months ago - 29 dependent packages - 251 dependent repositories - 689 thousand downloads last month - 723 stars on GitHub - 13 maintainers
recordflux 0.21.0
A toolset for the formal specification and generation of verifiable binary parsers, message gener...
23 versions - Latest release: 25 days ago - 1 dependent package - 1 dependent repository - 304 downloads last month - 101 stars on GitHub - 3 maintainers
sk-dist 0.1.9 - Top 5.5% on pypi.org
Distributed scikit-learn meta-estimators with PySpark
10 versions - Latest release: about 4 years ago - 4 dependent repositories - 439 thousand downloads last month - 286 stars on GitHub - 1 maintainer
spark-yarn-submit 1.0.0
library to handle spark job submit in a yarn cluster in different environment
1 version - Latest release: over 7 years ago - 1 dependent repository - 12 downloads last month - 3 stars on GitHub - 1 maintainer
tinsel 0.3.0
PySpark schema generator
3 versions - Latest release: over 5 years ago - 1 dependent repository - 214 thousand downloads last month - 1 maintainer
jumpy 0.2.4 - Top 8.7% on pypi.org
Numpy and nd4j interop
6 versions - Latest release: over 5 years ago - 4 dependent repositories - 90 downloads last month - 13,453 stars on GitHub - 3 maintainers
pydatavec 0.1.2
Python interface for DataVec
2 versions - Latest release: over 4 years ago - 1 dependent package - 1 dependent repository - 37 downloads last month - 13,290 stars on GitHub - 1 maintainer
lython 1.0
Lisp dialect compiler to Python byte-code
1 version - Latest release: 9 months ago - 1 dependent repository - 1 maintainer
koalas 1.8.2 - Top 1.2% on pypi.org
Koalas: pandas API on Apache Spark
47 versions - Latest release: over 2 years ago - 11 dependent packages - 444 dependent repositories - 2.24 million downloads last month - 3,308 stars on GitHub - 7 maintainers
bigdl-llm 2.4.0
Large Language Model Develop Toolkit
332 versions - Latest release: 6 months ago - 24.6 thousand downloads last month - 4,693 stars on GitHub - 1 maintainer
aj-zsl-nlu 4.2.0
John Snow Labs NLU provides state of the art algorithms for NLP&NLU with 10000+ of pretrained mod...
1 version - Latest release: 10 months ago - 30 downloads last month - 821 stars on GitHub - 1 maintainer
nlu-by-samed 5.1.4
John Snow Labs NLU provides state of the art algorithms for NLP&NLU with 20000+ of pretrained mod...
2 versions - Latest release: 3 months ago - 19 downloads last month - 821 stars on GitHub - 1 maintainer
shailesh-text-gen 4.2.1
John Snow Labs NLU provides state of the art algorithms for NLP&NLU with 10000+ of pretrained mod...
1 version - Latest release: 10 months ago - 19 downloads last month - 821 stars on GitHub - 1 maintainer
nlu-ocr-shailesh 5.0.0
John Snow Labs NLU provides state of the art algorithms for NLP&NLU with 20000+ of pretrained mod...
1 version - Latest release: 7 months ago - 20 downloads last month - 821 stars on GitHub - 1 maintainer
shailesh-bart 4.2.1
John Snow Labs NLU provides state of the art algorithms for NLP&NLU with 10000+ of pretrained mod...
1 version - Latest release: 10 months ago - 22 downloads last month - 821 stars on GitHub - 1 maintainer
nlu-by-ckl 5.0.2rc1
John Snow Labs NLU provides state of the art algorithms for NLP&NLU with 20000+ of pretrained mod...
15 versions - Latest release: 8 months ago - 2 dependent packages - 154 downloads last month - 821 stars on GitHub - 1 maintainer
table-extractor-new 5.1.0
John Snow Labs NLU provides state of the art algorithms for NLP&NLU with 20000+ of pretrained mod...
1 version - Latest release: 5 months ago - 28 downloads last month - 821 stars on GitHub - 1 maintainer
nlu-spark23 1.1.1rc2
John Snow Labs NLU provides state of the art algorithms for NLP&NLU with hundreds of pretrained m...
1 version - Latest release: over 3 years ago - 1 dependent repository - 20 downloads last month - 821 stars on GitHub - 2 maintainers
nlu 5.3.1 - Top 2.9% on pypi.org
John Snow Labs NLU provides state of the art algorithms for NLP&NLU with 20000+ of pretrained mod...
127 versions - Latest release: 18 days ago - 13 dependent packages - 8 dependent repositories - 18.8 thousand downloads last month - 821 stars on GitHub - 2 maintainers
shailesh 4.2.1
John Snow Labs NLU provides state of the art algorithms for NLP&NLU with 10000+ of pretrained mod...
1 version - Latest release: 10 months ago - 18 downloads last month - 814 stars on GitHub - 1 maintainer
databricks-connect 14.3.2 - Top 1.7% on pypi.org
Databricks Connect Client
211 versions - Latest release: 11 days ago - 15 dependent packages - 64 dependent repositories - 1.02 million downloads last month - 37,738 stars on GitHub - 19 maintainers
freeza-offset 1.0.10
Spark stream consumption commit in kafka consumer group
10 versions - Latest release: almost 4 years ago - 1 dependent repository - 964 downloads last month - 14 stars on GitHub - 1 maintainer
pyspark-hnsw 1.1.0 - Top 8.9% on pypi.org
Java library for approximate nearest neighbors search using Hierarchical Navigable Small World gr...
30 versions - Latest release: over 1 year ago - 1 dependent repository - 31.4 thousand downloads last month - 242 stars on GitHub - 1 maintainer
pyspark-hyperloglog 2.1.1
PySpark UDFs for HyperLogLog
1 version - Latest release: almost 6 years ago - 3 dependent repositories - 64 downloads last month - 6 stars on GitHub - 1 maintainer
dbldatagen 0.3.6 - Top 5.7% on pypi.org
Databricks Labs - PySpark Synthetic Data Generator
13 versions - Latest release: 3 months ago - 1 dependent package - 2 dependent repositories - 173 thousand downloads last month - 268 stars on GitHub - 1 maintainer
sparkdantic 0.20.5
A pydantic -> spark schema library
30 versions - Latest release: 2 months ago - 32.9 thousand downloads last month - 268 stars on GitHub - 1 maintainer
delta-spark 3.2.0 - Top 0.9% on pypi.org
Python APIs for using Delta Lake with Apache Spark
19 versions - Latest release: 10 days ago - 38 dependent packages - 90 dependent repositories - 11.2 million downloads last month - 6,958 stars on GitHub - 6 maintainers
jupyterlab-spark-ui-tab 0.0.14
Spark UI extension for jupyterlab
9 versions - Latest release: about 5 years ago - 1 dependent repository - 91 downloads last month - 8 stars on GitHub - 1 maintainer
sparkmonitor-s 0.0.22
Spark Monitor Extension for Jupyter Notebook
48 versions - Latest release: almost 3 years ago - 1 dependent repository - 250 downloads last month - 172 stars on GitHub - 2 maintainers
splink 3.9.14 - Top 3.9% on pypi.org
Fast probabilistic data linkage at scale
129 versions - Latest release: about 2 months ago - 3 dependent packages - 4 dependent repositories - 108 thousand downloads last month - 1,072 stars on GitHub - 4 maintainers
ppextensions 0.0.6
PPExtenions - Set of iPython and Jupyter extensions
5 versions - Latest release: almost 5 years ago - 1 dependent repository - 62 downloads last month - 50 stars on GitHub - 7 maintainers
spark-ml-utils 0.0.3
Some spark ml utilities, for easy checking/modifying spark pipeline, extracting feature importanc...
3 versions - Latest release: over 2 years ago - 1 dependent repository - 135 downloads last month - 1 star on GitHub - 1 maintainer
dummy_spark 0.0.1
A pure python mocked version of pyspark's rdd class
1 version - Latest release: almost 8 years ago - 32 downloads last month - 27 stars on GitHub - 1 maintainer
sqlglot-doris 1.1.9
An easily customizable SQL parser and transpiler
42 versions - Latest release: 9 days ago - 513 downloads last month - 4,253 stars on GitHub - 1 maintainer
pathling 6.4.2
Python API for Pathling
44 versions - Latest release: 5 months ago - 1 dependent repository - 781 downloads last month - 78 stars on GitHub - 1 maintainer
nessiedemo 0.0.19
Project Nessie Demos Helper
4 versions - Latest release: almost 3 years ago - 1 dependent repository - 41 downloads last month - 824 stars on GitHub - 5 maintainers
spark-nlp 5.3.3 - Top 1.4% on pypi.org
John Snow Labs Spark NLP is a natural language processing library built on top of Apache Spark ML...
141 versions - Latest release: about 1 month ago - 35 dependent packages - 35 dependent repositories - 4.1 million downloads last month - 3,715 stars on GitHub - 3 maintainers
ckls-test-lib 4.2.7 (removed) - Top 10.0% on pypi.org
John Snow Labs Spark NLP is a natural language processing library built on top of Apache Spark ML...
1 version - Latest release: over 1 year ago - 85 downloads last month - 3,066 stars on GitHub - 1 maintainer
pyoptimus 0.1.0 - Top 9.4% on pypi.org
Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion.
32 versions - Latest release: over 1 year ago - 1 dependent repository - 371 downloads last month - 1,441 stars on GitHub - 2 maintainers
flytekitplugins-deck-standard 1.12.0 - Top 5.2% on pypi.org
This Plugin provides more renderers to improve task visibility
141 versions - Latest release: 13 days ago - 3 dependent repositories - 22 thousand downloads last month - 200 stars on GitHub - 1 maintainer
flytekitplugins-whylogs 1.12.0 - Top 9.5% on pypi.org
Enable the use of whylogs profiles to be used in flyte tasks to get aggregate statistics about data.
104 versions - Latest release: 13 days ago - 1 dependent repository - 5.37 thousand downloads last month - 200 stars on GitHub - 1 maintainer
flytekitplugins-data-fsspec 1.12.0 - Top 8.4% on pypi.org
This is a deprecated plugin as of flytekit 1.5
201 versions - Latest release: 13 days ago - 1 dependent repository - 2.42 thousand downloads last month - 200 stars on GitHub - 1 maintainer
flytekitplugins-huggingface 1.12.0 - Top 9.6% on pypi.org
Hugging Face plugin for flytekit
96 versions - Latest release: 13 days ago - 1 dependent repository - 8.43 thousand downloads last month - 200 stars on GitHub - 1 maintainer
flytekit 1.12.0 - Top 2.4% on pypi.org
Flyte SDK for Python
376 versions - Latest release: 13 days ago - 53 dependent packages - 41 dependent repositories - 290 thousand downloads last month - 200 stars on GitHub - 7 maintainers
flytekitplugins-dbt 1.12.0 - Top 8.9% on pypi.org
DBT Plugin for Flytekit
96 versions - Latest release: 13 days ago - 1 dependent repository - 2.77 thousand downloads last month - 200 stars on GitHub - 1 maintainer
flytekitplugins-identity-aware-proxy 1.12.0
External command plugin to generate ID tokens for GCP Identity Aware Proxy
31 versions - Latest release: 13 days ago - 3.27 thousand downloads last month - 200 stars on GitHub - 1 maintainer
flytekitplugins-dask 1.12.0 - Top 8.5% on pypi.org
Dask plugin for flytekit
69 versions - Latest release: 13 days ago - 1 dependent repository - 2.89 thousand downloads last month - 200 stars on GitHub - 1 maintainer
dummyrdd 0.1.2
A pure python mocked version of pyspark's rdd class
11 versions - Latest release: almost 7 years ago - 1 dependent repository - 51 downloads last month - 27 stars on GitHub - 1 maintainer
flytekitplugins-pydantic 1.12.0
Plugin adding type support for Pydantic models
32 versions - Latest release: 13 days ago - 3.71 thousand downloads last month - 200 stars on GitHub - 1 maintainer
pyrasterframes 0.11.1 - Top 8.3% on pypi.org
Access and process geospatial raster data in PySpark DataFrames
15 versions - Latest release: about 1 year ago - 1 dependent repository - 4.58 thousand downloads last month - 205 stars on GitHub - 2 maintainers
gor-pyspark 3.22.6
Python helper function for gor-spark
13 versions - Latest release: almost 2 years ago - 1 dependent repository - 129 downloads last month - 0 stars on GitHub - 1 maintainer
visions 0.7.6 - Top 2.8% on pypi.org
Visions
28 versions - Latest release: 3 months ago - 6 dependent packages - 722 dependent repositories - 1.75 million downloads last month - 198 stars on GitHub - 2 maintainers
seipy 1.3.2
Helper functions for data science
4 versions - Latest release: almost 6 years ago - 1 dependent repository - 101 downloads last month - 4 stars on GitHub - 1 maintainer
pynessie 0.67.0 - Top 5.7% on pypi.org
Project Nessie: Transactional Catalog for Data Lakes with Git-like semantics
79 versions - Latest release: 3 months ago - 8 dependent repositories - 2.16 thousand downloads last month - 824 stars on GitHub - 6 maintainers
hsfs 3.7.6 - Top 5.4% on pypi.org
HSFS: An environment independent client to interact with the Hopsworks Featurestore
163 versions - Latest release: 16 days ago - 1 dependent package - 15 dependent repositories - 10.8 thousand downloads last month - 51 stars on GitHub - 1 maintainer
pramen-py 1.8.8
Pramen transformations written in python
30 versions - Latest release: 3 days ago - 451 downloads last month - 22 stars on GitHub - 3 maintainers
glue-utils 0.4.0
Reusable utilities for working with Glue PySpark jobs
19 versions - Latest release: 3 days ago - 2.57 thousand downloads last month - 1 star on GitHub - 1 maintainer
synapseml 1.0.4 - Top 3.3% on pypi.org
Synapse Machine Learning
16 versions - Latest release: about 1 month ago - 2 dependent packages - 3 dependent repositories - 233 thousand downloads last month - 4,981 stars on GitHub - 1 maintainer
sqlglot 23.13.1 - Top 1.2% on pypi.org
An easily customizable SQL parser and transpiler
521 versions - Latest release: 15 days ago - 104 dependent packages - 272 dependent repositories - 2.86 million downloads last month - 5,389 stars on GitHub - 1 maintainer
fugue 0.9.0 - Top 1.9% on pypi.org
An abstraction layer for distributed computation
114 versions - Latest release: 20 days ago - 19 dependent packages - 97 dependent repositories - 687 thousand downloads last month - 1,866 stars on GitHub - 2 maintainers
pyspark-test 0.2.0 - Top 9.6% on pypi.org
Check that left and right spark DataFrame are equal.
2 versions - Latest release: over 2 years ago - 5 dependent repositories - 169 thousand downloads last month - 21 stars on GitHub - 1 maintainer
webexteamsarchiver 0.11.3
Room archiver utility for Webex Teams
10 versions - Latest release: about 2 years ago - 1 dependent repository - 93 downloads last month - 22 stars on GitHub - 1 maintainer
zingg 0.4.0 - Top 8.8% on pypi.org
Zingg Entity Resolution, Data Mastering and Deduplication
3 versions - Latest release: 5 months ago - 1 dependent repository - 1.15 thousand downloads last month - 890 stars on GitHub - 1 maintainer
pymrgeo 1.0.2
MrGeo (pronounced "Mister Geo") is an open source geospatial toolkit designed to provide raster-b...
3 versions - Latest release: almost 7 years ago - 1 dependent repository - 23 downloads last month - 203 stars on GitHub - 1 maintainer
brewblox-devcon-spark 0.5.2
Communication with Spark controllers
285 versions - Latest release: about 5 years ago - 1 dependent repository - 1.55 thousand downloads last month - 3 stars on GitHub - 1 maintainer
seldon 2.2.5 - Top 7.3% on pypi.org
Seldon Python Utilities
44 versions - Latest release: almost 7 years ago - 2 dependent repositories - 576 downloads last month - 1,476 stars on GitHub - 1 maintainer
tensorframes 0.2.9 - Top 5.1% on pypi.org
Integration tools for running deep learning on Spark
2 versions - Latest release: about 6 years ago - 4 dependent repositories - 11.1 thousand downloads last month - 751 stars on GitHub - 1 maintainer
sparkorm 1.2.17
SparkORM: Python Spark SQL & DataFrame schema management and basic Object Relational Mapping.
21 versions - Latest release: 4 days ago - 386 thousand downloads last month - 9 stars on GitHub - 1 maintainer
sparkflow 0.7.0
Deep learning on Spark with Tensorflow
13 versions - Latest release: about 5 years ago - 1 dependent repository - 362 downloads last month - 300 stars on GitHub - 1 maintainer
intel-optimization-for-horovod 0.5.0
Intel® Optimization for Horovod* is the distributed training framework for TensorFlow* and PyTorch*.
8 versions - Latest release: about 1 year ago - 290 downloads last month - 4 stars on GitHub - 4 maintainers
fugue-sql-antlr-cpp 0.2.0 - Top 8.7% on pypi.org
Fugue SQL Antlr C++ Parser
15 versions - Latest release: 6 months ago - 1 dependent repository - 1.31 thousand downloads last month - 1,866 stars on GitHub - 2 maintainers
aztk 0.10.3
On-demand, Dockerized, Spark Jobs on Azure (powered by Azure Batch)
20 versions - Latest release: over 4 years ago - 1 dependent repository - 276 downloads last month - 151 stars on GitHub - 1 maintainer
graphframes 0.6 - Top 1.9% on pypi.org
GraphFrames: DataFrame-based Graphs
1 version - Latest release: over 5 years ago - 3 dependent packages - 36 dependent repositories - 2.09 million downloads last month - 971 stars on GitHub - 1 maintainer
fugue-sql-antlr 0.2.0 - Top 2.3% on pypi.org
Fugue SQL Antlr Parser
16 versions - Latest release: 6 months ago - 2 dependent packages - 110 dependent repositories - 433 thousand downloads last month - 1,866 stars on GitHub - 2 maintainers
spark-submit 1.4.0
Python manager for spark-submit jobs
7 versions - Latest release: about 1 year ago - 1 dependent repository - 1.29 thousand downloads last month - 10 stars on GitHub - 1 maintainer
cspark-python 0.0.13
Python library for Cisco Spark
4 versions - Latest release: about 7 years ago - 1 dependent repository - 22 downloads last month - 0 stars on GitHub - 1 maintainer
webexsdk 2.0.5
Community-developed Python SDK for the Webex Teams APIs
6 versions - Latest release: 6 months ago - 64 downloads last month - 1 maintainer
lofn 0.4.0
Lightweight Orchestration For Now: Wrapper for serial tools using Spark and Docker to parallelize
1 version - Latest release: over 6 years ago - 1 dependent repository - 21 downloads last month - 2 stars on GitHub - 1 maintainer
webexteamssdk 1.6.1 - Top 1.8% on pypi.org
Community-developed Python SDK for the Webex Teams APIs
12 versions - Latest release: almost 2 years ago - 11 dependent packages - 1,467 dependent repositories - 265 thousand downloads last month - 229 stars on GitHub - 2 maintainers
typedspark 1.4.2
Column-wise type annotations for pyspark DataFrames
38 versions - Latest release: 19 days ago - 1 dependent package - 11.9 thousand downloads last month - 52 stars on GitHub - 1 maintainer
mleap 0.23.1 - Top 1.7% on pypi.org
MLeap Python API
15 versions - Latest release: 6 months ago - 3 dependent packages - 61 dependent repositories - 158 thousand downloads last month - 1,494 stars on GitHub - 2 maintainers
pyspark-data-sources 0.1.2
Custom Spark data sources for reading and writing data in Apache Spark, using the Python Data Sou...
3 versions - Latest release: 3 months ago - 47 downloads last month - 38,255 stars on GitHub - 1 maintainer
starlake-orchestration 0.1.2
Starlake Python Distribution For orchestration
6 versions - Latest release: 3 months ago - 2 dependent packages - 58 downloads last month - 31 stars on GitHub - 1 maintainer
starlake-dagster 0.1.2
Starlake Python Distribution For Dagster
5 versions - Latest release: 3 months ago - 1 dependent package - 45 downloads last month - 31 stars on GitHub - 1 maintainer
starlake-airflow 0.1.2
Starlake Python Distribution For Airflow
23 versions - Latest release: 3 months ago - 1 dependent package - 163 downloads last month - 31 stars on GitHub - 1 maintainer
spark-dataframe-tools 0.6.7
spark_dataframe_tools
17 versions - Latest release: about 1 month ago - 12 dependent packages - 351 downloads last month - 1 maintainer
h2o-mlflow-flavor 0.1.0
A mlflow flavor for working with H2O-3 MOJO and POJO models
1 version - Latest release: 6 months ago - 39 downloads last month - 6,710 stars on GitHub - 1 maintainer
pytest-spark 0.6.0 - Top 3.9% on pypi.org
pytest plugin to run the tests with support of pyspark.
15 versions - Latest release: about 4 years ago - 7 dependent packages - 71 dependent repositories - 290 thousand downloads last month - 82 stars on GitHub - 1 maintainer
pyspark 3.5.1 - Top 0.1% on pypi.org
Apache Spark Python API
44 versions - Latest release: 3 months ago - 588 dependent packages - 6,227 dependent repositories - 29 million downloads last month - 38,255 stars on GitHub - 1 maintainer
pydoris 1.0.5
Python interface to Doris
8 versions - Latest release: 3 months ago - 2 dependent packages - 24.2 thousand downloads last month - 10,473 stars on GitHub - 1 maintainer
pydantic-spark 1.0.1
Converting pydantic classes to spark schemas
6 versions - Latest release: 6 months ago - 2 dependent packages - 1 dependent repository - 59.3 thousand downloads last month - 22 stars on GitHub - 1 maintainer
dbl-waterbear 0.1.1
Automated provisioning of an industry Lakehouse with enterprise data model
2 versions - Latest release: about 2 years ago - 1 dependent repository - 12 downloads last month - 9 stars on GitHub - 1 maintainer
lftakakura-mage-ai 0.9.37a1
Mage is a tool for building and deploying data pipelines.
1 version - Latest release: 7 months ago - 15 downloads last month - 6,086 stars on GitHub - 1 maintainer
cz-sqlglot 0.0.1
An easily customizable SQL parser and transpiler
1 version - Latest release: 8 months ago - 26 downloads last month - 4,253 stars on GitHub - 1 maintainer
pyspark-connectby 1.1.3
connectby hierarchy query in spark
6 versions - Latest release: 2 months ago - 656 downloads last month - 38,255 stars on GitHub - 1 maintainer
jupyter-enterprise-gateway 3.2.3 - Top 7.5% on pypi.org
A web server for spawning and communicating with remote Jupyter kernels
41 versions - Latest release: 3 months ago - 1 dependent package - 1 dependent repository - 2.82 thousand downloads last month - 607 stars on GitHub - 4 maintainers
horovod 0.28.1 - Top 0.7% on pypi.org
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
73 versions - Latest release: 11 months ago - 13 dependent packages - 327 dependent repositories - 66.2 thousand downloads last month - 13,954 stars on GitHub - 2 maintainers
hdijupyterutils 0.21.0 - Top 1.7% on pypi.org
HdiJupyterUtils: Utils for Jupyter projects from HDInsight team
53 versions - Latest release: 8 months ago - 3 dependent packages - 92 dependent repositories - 93.5 thousand downloads last month - 1,286 stars on GitHub - 4 maintainers
h2o 3.46.0.2 - Top 0.7% on pypi.org
H2O, Fast Scalable Machine Learning, for python
113 versions - Latest release: 6 days ago - 14 dependent packages - 393 dependent repositories - 323 thousand downloads last month - 6,710 stars on GitHub - 2 maintainers
onetl 0.10.2
One ETL tool to rule them all
14 versions - Latest release: about 2 months ago - 797 downloads last month - 58 stars on GitHub - 2 maintainers
Related Keywords
python (182)
pyspark (113)
scala (79)
tensorflow (69)
machine-learning (67)
pytorch (59)
big-data (57)
pandas (53)
data-science (48)
NLP (44)
distributed-deep-learning (44)
development (41)
transformers (39)
data (36)
sql (36)
machine learning (34)
distributed (34)
databricks (31)
keras (30)
etl (28)
llm (26)
automl (25)
apache-spark (25)
analytics-zoo (25)
bigdl (25)
parallel (24)
big data (24)
h2o (23)
hacktoberfest (23)
modeling (23)
jupyter (22)
data mining (22)
statistical analysis (22)
java (21)
aws (21)
hadoop (21)
deep-learning (21)
dataframe (20)
rsparkling (20)
pysparkling (20)
integration (20)
neural-architecture-search (19)
data-engineering (19)
NLU (19)
mlops (18)
kubernetes (18)
bigdata (17)
hive (16)
data-analysis (15)
sdk (14)
scikit-learn (14)
dask (14)
entity-resolution (13)
Spark (13)
text-classification (12)
seq2seq (12)
sentiment-analysis (12)
azure (12)
named-entity-recognition (12)
lemmatizer (12)
language-detection (12)
spell-checker (12)
pypi (12)
python3 (12)
extensible (12)
bigquery (12)
automation (12)
jupyter-notebook (11)
streamlit (11)
schema (11)
snowflake (11)
t5 (11)
workflows (11)
r (11)
analytics (11)
flyte (11)
flyte-tasks (11)
text-summarization (10)
text-translation (10)
sentiment-classifier (10)
sentence-embeddings (10)
nlu (10)
natural-language-understanding (10)
hdfs (10)
dependency-parsing (10)
bert-embedding (10)
redshift (10)
scoring (10)
artificial-intelligence (10)
deep learning (9)
ai (9)
docker (9)
api (9)
gpu (9)
delta-lake (9)
kafka (9)
jdbc (9)
clickhouse (9)
postgres (9)
distributed-computing (9)