Ecosyste.ms: Packages

An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "spark" keyword

h2o-pysparkling-scoring-2.1 3.32.1.7-1
Sparkling Water integrates H2O's Fast Scalable Machine Learning with Spark
7 versions - Latest release: almost 3 years ago - 13 downloads last month - 953 stars on GitHub - 1 maintainer
Top 9.2% on pypi.org
h2o-pysparkling-2.3 3.26.11
Sparkling Water integrates H2O's Fast Scalable Machine Learning with Spark
102 versions - Latest release: over 4 years ago - 407 downloads last month - 953 stars on GitHub - 1 maintainer
bigdl-llm 2.4.0
Large Language Model Develop Toolkit
349 versions - Latest release: 7 months ago - 21.6 thousand downloads last month - 4,693 stars on GitHub - 1 maintainer
sparkdantic 0.20.5
A pydantic -> spark schema library
36 versions - Latest release: 3 months ago - 34.2 thousand downloads last month - 270 stars on GitHub - 1 maintainer
ng_ai 0.2.8
NebulaGraph AI Suite
9 versions - Latest release: about 1 year ago - 44 downloads last month - 62 stars on GitHub - 2 maintainers
delta-sync 0.0.1
Syncing delta tables
2 versions - Latest release: 9 months ago - 10 downloads last month - 0 stars on GitHub - 1 maintainer
Top 9.8% on pypi.org
spyrk 0.0.4
Python module for Spark devices
3 versions - Latest release: about 8 years ago - 6 dependent repositories - 32 downloads last month - 35 stars on GitHub - 1 maintainer
emr-serverless-customauth
EMR Serverless Custom Authenticator for spark magic kernel.
1 version - 175 downloads last month - 1,289 stars on GitHub - 1 maintainer
Top 1.7% on pypi.org
hdijupyterutils 0.21.0
HdiJupyterUtils: Utils for Jupyter projects from HDInsight team
53 versions - Latest release: 9 months ago - 3 dependent packages - 92 dependent repositories - 76.2 thousand downloads last month - 1,289 stars on GitHub - 4 maintainers
Top 1.9% on pypi.org
autovizwidget 0.21.0
AutoVizWidget: An Auto-Visualization library for pandas dataframes
54 versions - Latest release: 9 months ago - 4 dependent packages - 93 dependent repositories - 75.6 thousand downloads last month - 1,289 stars on GitHub - 4 maintainers
Top 1.7% on pypi.org
sparkmagic 0.21.0
SparkMagic: Spark execution via Livy
56 versions - Latest release: 9 months ago - 4 dependent packages - 86 dependent repositories - 46.3 thousand downloads last month - 1,273 stars on GitHub - 5 maintainers
dora-parser 0.1.3
SQL Parser and Transpiler
9 versions - Latest release: over 2 years ago - 1 dependent repository - 25 downloads last month - 1 maintainer
pydoris-client 1.0.4
Python interface to Doris
3 versions - Latest release: 7 months ago - 45 downloads last month - 11,627 stars on GitHub - 1 maintainer
dbt-doris 0.3.4
The doris adapter plugin for dbt
8 versions - Latest release: 9 months ago - 1 dependent repository - 133 downloads last month - 10,473 stars on GitHub - 1 maintainer
pydoris 1.0.5
Python interface to Doris
8 versions - Latest release: 4 months ago - 2 dependent packages - 26.5 thousand downloads last month - 10,473 stars on GitHub - 1 maintainer
Top 9.8% on pypi.org
jupyterlab-sql-editor 0.1.96
SQL editor support for formatting, syntax highlighting and code completion of SQL in cell magic, ...
94 versions - Latest release: about 1 month ago - 1 dependent repository - 1.08 thousand downloads last month - 81 stars on GitHub - 1 maintainer
pyspark-data-sources 0.1.2
Custom Spark data sources for reading and writing data in Apache Spark, using the Python Data Sou...
4 versions - Latest release: 4 months ago - 81 downloads last month - 38,255 stars on GitHub - 1 maintainer
qogir 2021.9.16
Qogir: Make task easier
6 versions - Latest release: over 2 years ago - 1 dependent repository - 48 downloads last month - 1 maintainer
Top 7.2% on pypi.org
sparkdl 0.2.2
Integration tools for running deep learning on Spark
1 version - Latest release: about 6 years ago - 5 dependent repositories - 20.3 thousand downloads last month - 58 stars on GitHub - 1 maintainer
codeme 0.1.9
CodeMe - Automatic Python Coder
20 versions - Latest release: over 1 year ago - 140 downloads last month - 1 star on GitHub - 1 maintainer
analytics-command-center 3.0.14
Command Center for Data Ingestion, Advanced Analytics and Artificial Intelligence process
1 version - Latest release: over 2 years ago - 14 downloads last month - 11 stars on GitHub - 1 maintainer
marshmallow-pyspark 0.2.4
PySpark data serializer
6 versions - Latest release: 5 months ago - 1 dependent repository - 2.14 thousand downloads last month - 12 stars on GitHub - 1 maintainer
pyspark-bucketmap 0.0.5
Easily group pyspark data into buckets and map them to different values.
4 versions - Latest release: over 1 year ago - 25 downloads last month - 1 star on GitHub - 1 maintainer
Top 4.3% on pypi.org
sagemaker-pyspark 1.4.5
Amazon SageMaker PySpark Bindings
36 versions - Latest release: almost 2 years ago - 36 dependent repositories - 86.2 thousand downloads last month - 297 stars on GitHub - 1 maintainer
sparkpickle 1.0.1
Provides functions for reading SequenceFile-s with Python pickles.
2 versions - Latest release: over 7 years ago - 1 dependent repository - 4.89 thousand downloads last month - 24 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
spark-nlp 5.3.3
John Snow Labs Spark NLP is a natural language processing library built on top of Apache Spark ML...
141 versions - Latest release: 2 months ago - 35 dependent packages - 35 dependent repositories - 4.17 million downloads last month - 3,717 stars on GitHub - 3 maintainers
pramen-py 1.8.8
Pramen transformations written in python
32 versions - Latest release: 25 days ago - 400 downloads last month - 22 stars on GitHub - 3 maintainers
tfspark 1.3.5
Hops Hadoop version of TensorFlow on Spark (not for Apache Hadoop), leverages GPU scheduling
19 versions - Latest release: about 6 years ago - 1 dependent repository - 86 downloads last month - 4 stars on GitHub - 2 maintainers
pagaya-mapinpandas 0.5
Easy python wrapper for Spark mapInPandas, applyInPandas
1 version - Latest release: almost 2 years ago - 13 downloads last month - 1 maintainer
spark-quality-rules-tools 0.9.10
spark_quality_rules_tools
61 versions - Latest release: 2 months ago - 79 downloads last month - 1 maintainer
markdown-frames 1.0.6
Markdown tables parsing to pyspark / pandas DataFrames
7 versions - Latest release: over 1 year ago - 1 dependent repository - 22.1 thousand downloads last month - 3 stars on GitHub - 3 maintainers
zoe-analytics 0.8.1
Zoe - Analytics on demand
2 versions - Latest release: over 8 years ago - 5 dependent repositories - 16 downloads last month - 51 stars on GitHub - 1 maintainer
pytispark 1.0.1
TiSpark support for python
8 versions - Latest release: almost 6 years ago - 1 dependent repository - 39 downloads last month - 877 stars on GitHub - 3 maintainers
Top 0.7% on pypi.org
horovod 0.28.1
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
73 versions - Latest release: 12 months ago - 13 dependent packages - 327 dependent repositories - 90.1 thousand downloads last month - 14,006 stars on GitHub - 2 maintainers
spark-scaffolder-transforms-tools 0.0.1
spark_scaffolder_transforms_tools
1 version - Latest release: 2 months ago - 173 downloads last month - 1 maintainer
spark-doc-zh 2020.9.21.0 removed
Apache Spark official documentation, Chinese edition
1 version - Latest release: over 3 years ago - 1,182 stars on GitHub
husqvarna-getl 3.4.0
An elegant way to ETL'ing
51 versions - Latest release: 4 months ago - 1 dependent repository - 392 downloads last month - 9 stars on GitHub - 1 maintainer
spark-nlp-models 0.0.1
spark-nlp-models package
1 version - Latest release: almost 4 years ago - 1 dependent repository - 19 downloads last month - 3 maintainers
Top 3.9% on pypi.org
sparknlp 1.0.0
John Snow Labs Spark NLP is a natural language processing library built on top of Apache Spark ML...
2 versions - Latest release: over 3 years ago - 1 dependent package - 7 dependent repositories - 54.9 thousand downloads last month - 3 maintainers
Top 1.7% on pypi.org
databricks-connect 14.3.2
Databricks Connect Client
216 versions - Latest release: about 1 month ago - 15 dependent packages - 64 dependent repositories - 1.02 million downloads last month - 37,738 stars on GitHub - 19 maintainers
pysparkutils 0.2.5
A collection of utilities for handling pySpark's SparkContext
14 versions - Latest release: almost 7 years ago - 1 dependent repository - 39 downloads last month - 2 stars on GitHub - 1 maintainer
Top 6.6% on pypi.org
spark-df-profiling 1.1.13
Create HTML profiling reports from Apache Spark DataFrames
13 versions - Latest release: almost 8 years ago - 2 dependent repositories - 34.2 thousand downloads last month - 194 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
impyla 0.19.0
Python client for the Impala distributed query engine
52 versions - Latest release: 7 months ago - 29 dependent packages - 251 dependent repositories - 746 thousand downloads last month - 724 stars on GitHub - 13 maintainers
spark-datax-tools 0.6.6
spark_datax_tools
19 versions - Latest release: 3 months ago - 91 downloads last month - 1 maintainer
spark-rest-api 0.0.0
Async wrapper for Spark REST API
1 version - Latest release: 11 months ago - 12 downloads last month - 2 maintainers
Top 8.5% on pypi.org
flytekitplugins-dask 1.12.0
Dask plugin for flytekit
74 versions - Latest release: about 1 month ago - 1 dependent repository - 2.85 thousand downloads last month - 200 stars on GitHub - 1 maintainer
Top 9.5% on pypi.org
flytekitplugins-whylogs 1.12.0
Enable the use of whylogs profiles to be used in flyte tasks to get aggregate statistics about data.
109 versions - Latest release: about 1 month ago - 1 dependent repository - 4.03 thousand downloads last month - 200 stars on GitHub - 1 maintainer
flytekitplugins-identity-aware-proxy 1.12.0
External command plugin to generate ID tokens for GCP Identity Aware Proxy
36 versions - Latest release: about 1 month ago - 4.7 thousand downloads last month - 200 stars on GitHub - 1 maintainer
flytekitplugins-pydantic 1.12.0
Plugin adding type support for Pydantic models
37 versions - Latest release: about 1 month ago - 2.58 thousand downloads last month - 200 stars on GitHub - 1 maintainer
Top 8.4% on pypi.org
flytekitplugins-data-fsspec 1.12.0
This is a deprecated plugin as of flytekit 1.5
206 versions - Latest release: about 1 month ago - 1 dependent repository - 2.49 thousand downloads last month - 200 stars on GitHub - 1 maintainer
Top 2.4% on pypi.org
flytekit 1.12.0
Flyte SDK for Python
381 versions - Latest release: about 1 month ago - 53 dependent packages - 41 dependent repositories - 267 thousand downloads last month - 200 stars on GitHub - 7 maintainers
Top 5.2% on pypi.org
flytekitplugins-deck-standard 1.12.0
This Plugin provides more renderers to improve task visibility
146 versions - Latest release: about 1 month ago - 3 dependent repositories - 19.7 thousand downloads last month - 200 stars on GitHub - 1 maintainer
Top 9.6% on pypi.org
flytekitplugins-huggingface 1.12.0
Hugging Face plugin for flytekit
101 versions - Latest release: about 1 month ago - 1 dependent repository - 8.11 thousand downloads last month - 200 stars on GitHub - 1 maintainer
Top 8.9% on pypi.org
flytekitplugins-dbt 1.12.0
DBT Plugin for Flytekit
101 versions - Latest release: about 1 month ago - 1 dependent repository - 2.66 thousand downloads last month - 200 stars on GitHub - 1 maintainer
autograder3 3.0.0
A testing helper for Apache Spark Databricks notebooks
1 version - Latest release: over 4 years ago - 1 dependent repository - 7 downloads last month - 1 maintainer
yaetos 0.11.5
Write data & AI pipelines in (SQL, Spark, Pandas) and deploy them to the cloud, simplified
44 versions - Latest release: 3 months ago - 1 dependent repository - 123 downloads last month - 32 stars on GitHub - 1 maintainer
sourced-jgit-spark-connector 2.0.1
Engine to use Spark on top of source code repositories.
2 versions - Latest release: over 5 years ago - 1 dependent repository - 26 downloads last month - 71 stars on GitHub - 1 maintainer
pyspark-ds-toolbox 0.4.3
A Pyspark companion for data science tasks.
20 versions - Latest release: almost 2 years ago - 1 dependent repository - 44 downloads last month - 2 stars on GitHub - 1 maintainer
tidypyspark 0.0.1
dplyr for pyspark
1 version - Latest release: about 1 year ago - 17 downloads last month - 14 stars on GitHub - 2 maintainers
tfservingspark 0.1.0
Tensorflow serving on spark dataframe
1 version - Latest release: about 6 years ago - 1 dependent repository - 10 downloads last month - 7 stars on GitHub - 1 maintainer
pypandas 0.2.5
A data cleaning framework for Spark
7 versions - Latest release: about 6 years ago - 3 dependent repositories - 273 downloads last month - 7 stars on GitHub - 1 maintainer
metaspore 1.1.0
Metaspore: A Unified End-to-end Machine Intelligence Platform
3 versions - Latest release: over 1 year ago - 17 downloads last month - 629 stars on GitHub - 2 maintainers
Top 1.2% on pypi.org
koalas 1.8.2
Koalas: pandas API on Apache Spark
47 versions - Latest release: over 2 years ago - 11 dependent packages - 444 dependent repositories - 2.25 million downloads last month - 3,308 stars on GitHub - 7 maintainers
Top 9.9% on pypi.org
soda-spark 0.3.3
Soda SQL API for PySpark data frame
11 versions - Latest release: about 2 years ago - 1 dependent package - 1 dependent repository - 45.7 thousand downloads last month - 61 stars on GitHub - 1 maintainer
starlake-airflow 0.1.2
Starlake Python Distribution For Airflow
24 versions - Latest release: 4 months ago - 1 dependent package - 128 downloads last month - 31 stars on GitHub - 1 maintainer
starlake-dagster 0.1.2
Starlake Python Distribution For Dagster
6 versions - Latest release: 4 months ago - 1 dependent package - 85 downloads last month - 31 stars on GitHub - 1 maintainer
starlake-orchestration 0.1.2
Starlake Python Distribution For orchestration
7 versions - Latest release: 4 months ago - 2 dependent packages - 121 downloads last month - 31 stars on GitHub - 1 maintainer
spark-partition-server 0.1.5
Simple Python components for launching and managing servers on a running Spark cluster
3 versions - Latest release: almost 8 years ago - 2 dependent repositories - 12 downloads last month - 2 stars on GitHub - 1 maintainer
ydot 0.0.6
R-like formulas for Spark Dataframes
6 versions - Latest release: over 3 years ago - 1 dependent repository - 37 downloads last month - 10 stars on GitHub - 1 maintainer
exelog 0.0.1
Enabling meticulous logging for Spark Applications
1 version - Latest release: over 2 years ago - 1 dependent repository - 14 downloads last month - 5 stars on GitHub - 1 maintainer
spark-datax-schema-tools 0.0.43
spark_datax_schema_tools
15 versions - Latest release: over 1 year ago - 1 dependent repository - 84 downloads last month - 1 maintainer
pycebes 0.10.2
Python client for Cebes HTTP server.
2 versions - Latest release: over 6 years ago - 1 dependent repository - 13 downloads last month - 2 stars on GitHub - 1 maintainer
lehar 0.4
Visualize data using relative ordering
4 versions - Latest release: almost 7 years ago - 1 dependent repository - 30 downloads last month - 78 stars on GitHub - 1 maintainer
typedspark 1.4.2
Column-wise type annotations for pyspark DataFrames
39 versions - Latest release: about 1 month ago - 1 dependent package - 18 thousand downloads last month - 52 stars on GitHub - 1 maintainer
hadoop-fs-wrapper 0.6.1
Python Wrapper for Hadoop Java API
8 versions - Latest release: 11 months ago - 1 dependent package - 1 dependent repository - 1.94 thousand downloads last month - 3 stars on GitHub - 1 maintainer
spark-celery 0.1.1
A helper to allow Python Celery tasks to do work in a Spark job
3 versions - Latest release: over 6 years ago - 1 dependent repository - 360 downloads last month - 27 stars on GitHub - 1 maintainer
pyspark-spy 1.0.2
Collect and aggregate on spark events for profitz. In 🐍 way!
3 versions - Latest release: about 3 years ago - 1 dependent repository - 207 downloads last month - 10 stars on GitHub - 1 maintainer
e2eaiok 1.2.0
Intel® End-to-End AI Optimization Kit
174 versions - Latest release: 6 months ago - 403 downloads last month - 31 stars on GitHub - 1 maintainer
deltatuner 1.2.0
Intel extension for peft with PyTorch and DENAS
18 versions - Latest release: 6 months ago - 3.5 thousand downloads last month - 28 stars on GitHub - 2 maintainers
e2eaiok-deltatuner 1.2.0
Intel extension for peft with PyTorch and DENAS
3 versions - Latest release: 6 months ago - 21 downloads last month - 31 stars on GitHub - 2 maintainers
e2eaiok-modeladapter12 1.0.0 removed
Intel® End-to-End AI Optimization Kit
1 version - Latest release: about 1 year ago - 3 downloads last month - 24 stars on GitHub - 1 maintainer
e2eaiok-recdp 1.2.0
A data processing bundle for spark based recommender system operations
2 versions - Latest release: 6 months ago - 19 downloads last month - 31 stars on GitHub - 1 maintainer
e2eaiok-modeladapter9 1.0.0 removed
Intel® End-to-End AI Optimization Kit
1 version - Latest release: about 1 year ago - 5 downloads last month - 24 stars on GitHub - 1 maintainer
e2eaiok-sda 1.1.0
Intel® End-to-End AI Optimization Kit
54 versions - Latest release: about 1 year ago - 84 downloads last month - 28 stars on GitHub - 1 maintainer
e2eaiok-modeladapter10 1.0.0 removed
Intel® End-to-End AI Optimization Kit
1 version - Latest release: about 1 year ago - 4 downloads last month - 24 stars on GitHub - 1 maintainer
e2eaiok-modeladapter11 1.0.0 removed
Intel® End-to-End AI Optimization Kit
1 version - Latest release: about 1 year ago - 5 downloads last month - 24 stars on GitHub - 1 maintainer
e2eaiok-modeladapter 1.1.0
Intel® End-to-End AI Optimization Kit
18 versions - Latest release: about 1 year ago - 50 downloads last month - 28 stars on GitHub - 1 maintainer
e2eaiok-modeladapter7 1.0.0 removed
Intel® End-to-End AI Optimization Kit
1 version - Latest release: about 1 year ago - 3 downloads last month - 24 stars on GitHub - 1 maintainer
e2eaiok-modeladapter8 1.0.0 removed
Intel® End-to-End AI Optimization Kit
1 version - Latest release: about 1 year ago - 4 downloads last month - 24 stars on GitHub - 1 maintainer
e2eaiok-modeladapter6 1.0.0 removed
Intel® End-to-End AI Optimization Kit
1 version - Latest release: about 1 year ago - 3 downloads last month - 24 stars on GitHub - 1 maintainer
e2eaiok-denas 1.1.0
Intel® End-to-End AI Optimization Kit
57 versions - Latest release: about 1 year ago - 76 downloads last month - 28 stars on GitHub - 1 maintainer
e2eaiok-modeladapter3 1.0.0 removed
Intel® End-to-End AI Optimization Kit
1 version - Latest release: about 1 year ago - 3 downloads last month - 24 stars on GitHub - 1 maintainer
pyrecdp 1.2.0
A data processing bundle for spark based recommender system operations
74 versions - Latest release: 6 months ago - 1 dependent repository - 189 downloads last month - 28 stars on GitHub - 2 maintainers
e2eaiok-modeladapter2 1.0.0 removed
Intel® End-to-End AI Optimization Kit
1 version - Latest release: about 1 year ago - 4 downloads last month - 24 stars on GitHub - 1 maintainer
e2eaiok-modeladapter5 1.0.0 removed
Intel® End-to-End AI Optimization Kit
1 version - Latest release: about 1 year ago - 5 downloads last month - 24 stars on GitHub - 1 maintainer
e2eaiok-modeladapter1 1.0.0 removed
Intel® End-to-End AI Optimization Kit
1 version - Latest release: about 1 year ago - 18 stars on GitHub
reina 0.0.6
A Causal Inference library for Big Data.
6 versions - Latest release: almost 3 years ago - 1 dependent repository - 17 downloads last month - 10 stars on GitHub - 1 maintainer
Top 0.1% on pypi.org
pyspark 3.5.1
Apache Spark Python API
45 versions - Latest release: 4 months ago - 588 dependent packages - 6,227 dependent repositories - 28.8 million downloads last month - 38,255 stars on GitHub - 1 maintainer
pysparql 0.0.6
Query a SPARQL endpoint and manage the result with Spark
2 versions - Latest release: over 3 years ago - 1 dependent repository - 45 downloads last month - 0 stars on GitHub - 1 maintainer
sparkler 0.3.0
GitHub stats sparkline CLI
5 versions - Latest release: almost 5 years ago - 1 dependent repository - 28 downloads last month - 1 maintainer