Ecosyste.ms: Packages

An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org "apache-spark" keyword

dead-salmon-brain 0.0.7
Dead Salmon Brain is a cluster computing system for analysing A/B experiments
7 versions - Latest release: almost 2 years ago - 1 dependent repository - 60 downloads last month - 11 stars on GitHub - 2 maintainers
Top 3.8% on pypi.org
pysparkling 0.6.2
Pure Python implementation of the Spark RDD interface.
69 versions - Latest release: over 1 year ago - 1 dependent package - 34 dependent repositories - 11.8 thousand downloads last month - 261 stars on GitHub - 2 maintainers
Top 0.6% on pypi.org
mlflow-skinny 2.12.1
MLflow is an open source platform for the complete machine learning lifecycle
56 versions - Latest release: 14 days ago - 35 dependent packages - 70 dependent repositories - 4.44 million downloads last month - 17,301 stars on GitHub - 15 maintainers
mlflow-databricks-artifacts 2.0.1
Plugin to create and access MLflow-managed artifacts on Databricks
3 versions - Latest release: over 1 year ago - 1 dependent repository - 2.19 thousand downloads last month - 17,301 stars on GitHub - 4 maintainers
Top 0.2% on pypi.org
mlflow 2.12.1
MLflow is an open source platform for the complete machine learning lifecycle
95 versions - Latest release: 14 days ago - 244 dependent packages - 5,089 dependent repositories - 12.9 million downloads last month - 17,301 stars on GitHub - 19 maintainers
mlflow-tmp 2.2.26
MLflow: A Platform for ML Development and Productionization
25 versions - Latest release: 11 months ago - 136 downloads last month - 17,301 stars on GitHub - 2 maintainers
mlflow-by-johnsnowlabs 2.40.0
MLflow: A Platform for ML Development and Productionization
35 versions - Latest release: 7 months ago - 183 downloads last month - 16,113 stars on GitHub - 2 maintainers
mlflow-devlibx 1.22.8
MLflow: A Platform for ML Development and Productionization
9 versions - Latest release: over 2 years ago - 1 dependent repository - 62 downloads last month - 17,288 stars on GitHub - 1 maintainer
mlflow-stonewise 1.30.1
MLflow: A Platform for ML Development and Productionization
1 version - Latest release: over 1 year ago - 20 downloads last month - 17,288 stars on GitHub - 1 maintainer
mlflowcollab 0.0.4
Use MLflow in a central location
1 version - Latest release: about 2 years ago - 1 dependent repository - 13 downloads last month - 17,288 stars on GitHub - 1 maintainer
aim-mlflow 0.2.1
Aim-MLflow integration
4 versions - Latest release: about 1 year ago - 1 dependent repository - 340 downloads last month - 17,288 stars on GitHub - 2 maintainers
mlflow-by-ckl 2.67.0
MLflow: A Platform for ML Development and Productionization
32 versions - Latest release: 15 days ago - 354 downloads last month - 16,113 stars on GitHub - 2 maintainers
blind 0.0.1
Blind Client: The easiest ML tracking library
1 version - Latest release: over 2 years ago - 2 dependent repositories - 33 downloads last month - 17,288 stars on GitHub - 2 maintainers
mlflow-ste 1.10.1.dev0
MLflow: An ML Workflow Tool
1 version - Latest release: over 3 years ago - 1 dependent repository - 20 downloads last month - 16,107 stars on GitHub - 1 maintainer
qubole-ml 1.9.1
MLflow: An ML Workflow Tool
2 versions - Latest release: over 3 years ago - 1 dependent repository - 16 downloads last month - 17,274 stars on GitHub - 1 maintainer
mlflow-by-johnsnowlabs-v2 2.44.0
MLflow: A Platform for ML Development and Productionization
9 versions - Latest release: 8 months ago - 50 downloads last month - 17,280 stars on GitHub - 2 maintainers
mlflow-saagie 2.9.2
MLflow: A Platform for ML Development and Productionization - forked for Saagie
8 versions - Latest release: 3 months ago - 1 dependent repository - 33 downloads last month - 17,280 stars on GitHub - 1 maintainer
lmcmlflow 1.17.1
MLflow: A Platform for ML Development and Productionization
3 versions - Latest release: almost 3 years ago - 1 dependent repository - 17 downloads last month - 17,274 stars on GitHub - 1 maintainer
Top 3.3% on pypi.org
synapseml 1.0.4
Synapse Machine Learning
16 versions - Latest release: 20 days ago - 2 dependent packages - 3 dependent repositories - 179 thousand downloads last month - 4,969 stars on GitHub - 2 maintainers
patek 0.5.2
A collection of utilities and tools for accelerating pyspark development and productivity.
7 versions - Latest release: about 1 year ago - 239 downloads last month - 0 stars on GitHub - 1 maintainer
pyspark3d 0.3.1
Spark extension for processing large-scale 3D data sets
4 versions - Latest release: over 5 years ago - 1 dependent repository - 75 downloads last month - 29 stars on GitHub - 2 maintainers
pyjaws 0.1.7
PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows
10 versions - Latest release: 7 months ago - 1 dependent repository - 60 downloads last month - 37 stars on GitHub - 2 maintainers
Top 5.0% on pypi.org
sparkit-learn 0.2.6
Scikit-learn on PySpark
5 versions - Latest release: almost 9 years ago - 5 dependent repositories - 4.33 thousand downloads last month - 1,147 stars on GitHub - 6 maintainers
mmtfpyspark 0.3.6
Methods for parallel and distributed analysis and mining of the Protein Data Bank using MMTF and ...
10 versions - Latest release: over 5 years ago - 1 dependent repository - 76 downloads last month - 67 stars on GitHub - 4 maintainers
svm 0.1.0
Version manager for Apache Spark
2 versions - Latest release: over 6 years ago - 14 dependent repositories - 1.21 thousand downloads last month - 0 stars on GitHub - 2 maintainers
Top 6.9% on pypi.org
sparkmeasure 0.24.0
Python API for sparkMeasure, a tool for performance troubleshooting of Apache Spark workloads.
12 versions - Latest release: about 2 months ago - 1 dependent repository - 528 thousand downloads last month - 642 stars on GitHub - 2 maintainers
sparkql 0.10.0
sparkql: Apache Spark SQL DataFrame schema management for sensible humans
20 versions - Latest release: 8 months ago - 1 dependent repository - 27.9 thousand downloads last month - 13 stars on GitHub - 2 maintainers
tpcds-pyspark 1.0.5
TPCDS_PySpark is a TPC-DS workload generator implemented in Python designed to run at scale using...
5 versions - Latest release: about 2 months ago - 96 downloads last month - 400 stars on GitHub - 2 maintainers
test-cpu-parallel 1.0.5
test-CPU-parallel is a basic CPU workload generator.
2 versions - Latest release: 8 months ago - 27 downloads last month - 400 stars on GitHub - 2 maintainers
sparkhistogram 0.3
Sparkhistogram contains helper functions for generating data histograms with the Spark DataFrame ...
3 versions - Latest release: almost 2 years ago - 1 dependent repository - 73 downloads last month - 400 stars on GitHub - 1 maintainer
Top 3.1% on pypi.org
spark-sklearn 0.3.0
Integration tools for running scikit-learn on Spark
8 versions - Latest release: about 5 years ago - 14 dependent repositories - 721 thousand downloads last month - 1,075 stars on GitHub - 10 maintainers
google-dataproc-templates 0.1.0
Google Dataproc templates written in Python
8 versions - Latest release: about 1 year ago - 21.9 thousand downloads last month - 111 stars on GitHub - 3 maintainers
Top 6.8% on pypi.org
dataproc-templates 0.0.2 removed
Dataproc templates written in Python
2 versions - Latest release: over 1 year ago - 40 stars on GitHub
cleanflow 1.3.3a1
A framework for cleaning, pre-processing and exploring data in a scalable and distributed manner.
11 versions - Latest release: almost 6 years ago - 31 downloads last month - 1 star on GitHub - 2 maintainers
pybda 0.1.0
Analysis of big biological data sets for distributed HPC clusters.
6 versions - Latest release: over 4 years ago - 1 dependent repository - 54 downloads last month - 9 stars on GitHub - 2 maintainers
nozberkman-mmlspark 1.0.0
Microsoft ML for Spark
1 version - Latest release: over 2 years ago - 1 dependent repository - 14 downloads last month - 4,472 stars on GitHub - 2 maintainers
Top 7.5% on pypi.org
feathr 1.0.0
An Enterprise-Grade, High Performance Feature Store
22 versions - Latest release: about 1 year ago - 1 dependent repository - 1.8 thousand downloads last month - 1,928 stars on GitHub - 1 maintainer
Top 7.6% on pypi.org
sparktorch 0.2.0
Distributed training of PyTorch networks on Apache Spark with ML Pipeline support
11 versions - Latest release: about 1 year ago - 2 dependent repositories - 668 downloads last month - 333 stars on GitHub - 2 maintainers
fasttrackml 0.5.1
An experiment tracking server focused on speed and scalability
21 versions - Latest release: 26 days ago - 982 downloads last month - 93 stars on GitHub - 2 maintainers
lakefs 0.6.0
lakeFS Python SDK Wrapper
15 versions - Latest release: 14 days ago - 12.3 thousand downloads last month - 4,053 stars on GitHub - 1 maintainer
Top 6.3% on pypi.org
lakefs-sdk 1.20.0
lakeFS API
34 versions - Latest release: 14 days ago - 3 dependent packages - 1 dependent repository - 17.6 thousand downloads last month - 4,054 stars on GitHub - 2 maintainers
Top 3.2% on pypi.org
lakefs-client 1.20.0
lakeFS API
150 versions - Latest release: 14 days ago - 4 dependent packages - 5 dependent repositories - 15.4 thousand downloads last month - 4,053 stars on GitHub - 1 maintainer
sparkflow 0.7.0
Deep learning on Spark with Tensorflow
13 versions - Latest release: almost 5 years ago - 1 dependent repository - 237 downloads last month - 300 stars on GitHub - 2 maintainers
livyc 0.0.14 💰
Apache Livy Client
11 versions - Latest release: almost 2 years ago - 15 downloads last month - 3 stars on GitHub - 2 maintainers
spark-privacy-preserver 0.3.1
Anonymizing Library for Apache Spark
4 versions - Latest release: over 3 years ago - 1 dependent repository - 14 downloads last month - 30 stars on GitHub - 6 maintainers
Top 9.0% on pypi.org
jupyterlab-sparkmonitor 4.1.0
Spark Monitor Extension for Jupyter Lab
14 versions - Latest release: over 2 years ago - 1 dependent repository - 10.9 thousand downloads last month - 91 stars on GitHub - 2 maintainers
Top 3.3% on pypi.org
quinn 0.10.3
Pyspark helper methods to maximize developer efficiency
16 versions - Latest release: 3 months ago - 3 dependent packages - 11 dependent repositories - 800 thousand downloads last month - 574 stars on GitHub - 2 maintainers
parallel-simulations 0.0.1
Helper class to orchestrate in parallel Monte Carlo simulations for an arbitrary number of models...
1 version - Latest release: about 2 months ago - 36 downloads last month - 0 stars on GitHub - 2 maintainers
flintrock 2.1.0
A command-line tool for launching Apache Spark clusters.
14 versions - Latest release: 5 months ago - 1 dependent repository - 252 downloads last month - 631 stars on GitHub - 1 maintainer
Top 6.6% on pypi.org
spylon 0.3.0
Utilities to work with Scala/Java code with py4j
19 versions - Latest release: almost 7 years ago - 16 dependent repositories - 3.24 thousand downloads last month - 40 stars on GitHub - 6 maintainers
sparkmanager 0.7.3
A pyspark management framework
21 versions - Latest release: about 4 years ago - 1 dependent repository - 15 downloads last month - 0 stars on GitHub - 2 maintainers
pyspark-asyncactions 0.0.4 💰
Proof-of-concept asynchronous actions for PySpark using concurrent.futures
4 versions - Latest release: over 3 years ago - 1 dependent repository - 4.24 thousand downloads last month - 44 stars on GitHub - 2 maintainers
sparkora 0.0.1
Exploratory data analysis toolkit for Pyspark
1 version - Latest release: over 2 years ago - 1 dependent repository - 7 downloads last month - 53 stars on GitHub - 2 maintainers
chopin2 1.0.8
Supervised Classification with Hyperdimensional Computing
11 versions - Latest release: 2 months ago - 1 dependent repository - 25 downloads last month - 8 stars on GitHub - 1 maintainer
gdmix-workflow 0.3.0
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
5 versions - Latest release: over 3 years ago - 1 dependent repository - 9 downloads last month - 2,596 stars on GitHub - 2 maintainers
k8s-spark-helper-liangdao-data 0.0.3
Run spark task in k8s
3 versions - Latest release: about 1 year ago - 17 downloads last month - 2,596 stars on GitHub - 2 maintainers
dbnet 0.2.5
DbNet.
24 versions - Latest release: about 4 years ago - 1 dependent repository - 33 downloads last month - 7 stars on GitHub - 2 maintainers
dcborow-mmlspark 0.14.dev1
Microsoft ML for Spark
1 version - Latest release: about 4 years ago - 1 dependent repository - 79 downloads last month - 4,958 stars on GitHub - 2 maintainers
pyspark-connectors 0.2.0
The easy and quick way to connect and integrate the Spark project with many other data sources.
8 versions - Latest release: almost 2 years ago - 156 downloads last month - 5 stars on GitHub - 2 maintainers
spark-pipeline 0.0.4
Data Science oriented tools, mostly for Apache Spark
2 versions - Latest release: almost 4 years ago - 1 dependent repository - 10 downloads last month - 9 stars on GitHub - 2 maintainers
pysparkgateway 0.0.22
Connect Pyspark to remote clusters
18 versions - Latest release: about 3 years ago - 1 dependent repository - 17.9 thousand downloads last month - 3 stars on GitHub - 2 maintainers
pypair 3.0.9 💰
Pairwise association measures of statistical variable types
11 versions - Latest release: over 2 years ago - 1 dependent repository - 954 downloads last month - 21 stars on GitHub - 2 maintainers
exelog 0.0.1
Enabling meticulous logging for Spark Applications
1 version - Latest release: over 2 years ago - 1 dependent repository - 5 downloads last month - 5 stars on GitHub - 2 maintainers
pysparql 0.0.6
Query a SPARQL endpoint and manage the result with Spark
2 versions - Latest release: over 3 years ago - 1 dependent repository - 78 downloads last month - 0 stars on GitHub - 2 maintainers
Top 7.8% on pypi.org
dist-keras 0.2.1
Distributed Deep learning with Apache Spark with Keras.
3 versions - Latest release: over 6 years ago - 1 dependent repository - 7.63 thousand downloads last month - 624 stars on GitHub - 2 maintainers
Top 3.7% on pypi.org
pyspark-stubs 3.0.0 💰
A collection of the Apache Spark stub files
38 versions - Latest release: almost 4 years ago - 2 dependent packages - 146 dependent repositories - 206 thousand downloads last month - 114 stars on GitHub - 2 maintainers
eleflow-spark-integrations 0.0.1a2 removed
The easy and quick way to connect and integrate the Spark project with many other data sources.
2 versions - Latest release: almost 2 years ago - 6 stars on GitHub
jennytest 0.2.9 removed
Data quality and profiling tool powered by Apache Spark.
5 versions - Latest release: almost 4 years ago - 39 downloads last month