Ecosyste.ms: Packages

An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

conda-forge.org "spark" keyword

elephas 2.1.0
Distributed Deep learning with Keras & Spark
4 versions - Latest release: almost 3 years ago - 1,562 stars on GitHub
delta-sharing-python 0.5.2
An open protocol for secure data sharing
6 versions - Latest release: over 1 year ago - 539 stars on GitHub
glow 1.0.1
Glow is an open-source toolkit for working with genomic data at biobank-scale and beyond. The too...
2 versions - Latest release: about 3 years ago - 3 dependent repositories - 225 stars on GitHub
datacompy 0.8.3
Pandas and Spark DataFrame comparison for humans
9 versions - Latest release: over 1 year ago - 1 dependent repository - 269 stars on GitHub
sparkmagic 0.20.0
Jupyter magics and kernels for working with remote Spark clusters
17 versions - Latest release: almost 2 years ago - 2 dependent repositories - 1,207 stars on GitHub
hdijupyterutils 0.20.0
Jupyter magics and kernels for working with remote Spark clusters
10 versions - Latest release: almost 2 years ago - 4 dependent packages - 1 dependent repository - 1,207 stars on GitHub
fugue 0.7.3
Fugue is a unified interface for distributed computing that lets users execute Python, pandas, a...
8 versions - Latest release: over 1 year ago - 4 dependent repositories - 1,271 stars on GitHub
splink 1.0.6
Fast, accurate and scalable probabilistic data linkage using your choice of SQL backend
16 versions - Latest release: almost 3 years ago - 571 stars on GitHub
autovizwidget 0.20.0
Jupyter magics and kernels for working with remote Spark clusters
12 versions - Latest release: almost 2 years ago - 3 dependent packages - 1 dependent repository - 1,207 stars on GitHub
pyspark-test 0.2.0
Testing library for pyspark, inspired from pandas testing module but for pyspark, to help users w...
2 versions - Latest release: over 2 years ago - 14 stars on GitHub
r-sparkr 3.3.1
Apache Spark - A unified analytics engine for large-scale data processing
9 versions - Latest release: over 1 year ago - 37,996 stars on GitHub
mage-ai 0.7.5
🧙 The modern replacement for Airflow. Build, run, and manage data pipelines for integrating and t...
22 versions - Latest release: over 1 year ago - 3,631 stars on GitHub
sqlglot 10.0.6
Python SQL Parser and Transpiler
131 versions - Latest release: over 1 year ago - 2 dependent packages - 2,926 stars on GitHub
traceml 1.0.0
Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for P...
1 version - Latest release: almost 2 years ago - 463 stars on GitHub
flytekit 1.1.0
Flytekit Python is the Python Library for easily authoring, testing, deploying, and interacting w...
4 versions - Latest release: almost 2 years ago - 8 dependent packages - 123 stars on GitHub
flytekitplugins-sqlalchemy 1.2.4
SQLAlchemy plugin for Flytekit: `flytekitplugins-sqlalchemy` PyPI: [https://pypi.org/project/fly...
8 versions - Latest release: over 1 year ago - 123 stars on GitHub
flytekitplugins-modin 1.2.4
Modin plugin for Flytekit: `flytekitplugins-modin` PyPI: [https://pypi.org/project/flytekitplugi...
9 versions - Latest release: over 1 year ago - 123 stars on GitHub
flytekitplugins-athena 1.2.4
Athena plugin for Flytekit: `flytekitplugins-athena` PyPI: [https://pypi.org/project/flytekitplu...
8 versions - Latest release: over 1 year ago - 123 stars on GitHub
flytekitplugins-awsbatch 1.2.4
AWS Batch plugin for Flytekit: `flytekitplugins-awsbatch` PyPI: [https://pypi.org/project/flytek...
9 versions - Latest release: over 1 year ago - 123 stars on GitHub
flytekitplugins-data-fsspec 1.2.4
`fsspec` powered data-plugins for Flytekit: `flytekitplugins-data-fsspec` PyPI: [https://pypi.or...
8 versions - Latest release: over 1 year ago - 123 stars on GitHub
flytekitplugins-spark 1.0.5
Spark 3 plugin for Flytekit: `flytekitplugins-spark` PyPI: [https://pypi.org/project/flytekitplu...
1 version - Latest release: over 1 year ago - 123 stars on GitHub
mleap 0.21.0
MLeap: Deploy ML Pipelines to Production
9 versions - Latest release: over 1 year ago - 1 dependent package - 1,443 stars on GitHub
pixiedust 1.1.19
Python Helper library for Jupyter Notebooks
3 versions - Latest release: about 3 years ago - 1 dependent repository - 1,029 stars on GitHub
tensorflowonspark 2.2.5
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
10 versions - Latest release: over 1 year ago - 3,851 stars on GitHub
sparkhpc 0.3.post4
launching and controlling spark on hpc clusters
1 version - Latest release: about 5 years ago - 19 stars on GitHub
jupyter_enterprise_gateway 3.0.0
Jupyter Enterprise Gateway A lightweight, multi-tenant, scalable and secure gateway that enables ...
23 versions - Latest release: over 1 year ago - 550 stars on GitHub
zappy 0.2.0
Distributed processing with NumPy and Zarr
2 versions - Latest release: about 5 years ago - 1 dependent repository - 8 stars on GitHub
koalas 1.8.2
Koalas: pandas API on Apache Spark
42 versions - Latest release: over 2 years ago - 1 dependent package - 3,256 stars on GitHub
sagemaker_pyspark 1.4.2
A Spark library for Amazon SageMaker.
12 versions - Latest release: about 3 years ago - 274 stars on GitHub
flytekitplugins-pandera 1.2.4
Pandera plugin for Flytekit: `flytekitplugins-pandera` PyPI: [https://pypi.org/project/flytekitp...
7 versions - Latest release: over 1 year ago - 123 stars on GitHub
r-h2o 3.38.0.1
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gra...
11 versions - Latest release: over 1 year ago - 6,189 stars on GitHub
h2o-py 3.38.0.2
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gra...
31 versions - Latest release: over 1 year ago - 6,189 stars on GitHub
visions 0.7.5
Type System for Data Analysis in Python
12 versions - Latest release: over 2 years ago - 2 dependent packages - 13 dependent repositories - 174 stars on GitHub