Ecosyste.ms: Packages

An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "hadoop" keyword

impyla-jz 0.16.3
Python client for the Impala distributed query engine
1 version - Latest release: almost 3 years ago - 1 dependent repository - 20 downloads last month - 0 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
impyla 0.19.0
Python client for the Impala distributed query engine
52 versions - Latest release: 6 months ago - 29 dependent packages - 251 dependent repositories - 689 thousand downloads last month - 723 stars on GitHub - 13 maintainers
luigi-k8s-jobs-runner 2.8.10
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles depende...
11 versions - Latest release: about 4 years ago - 1 dependent repository - 37 downloads last month - 17,382 stars on GitHub - 1 maintainer
spark-yarn-submit 1.0.0
Library to handle Spark job submission on a YARN cluster in different environments
1 version - Latest release: over 7 years ago - 1 dependent repository - 12 downloads last month - 3 stars on GitHub - 1 maintainer
dbsync 0.1.1
Sync database to hadoop
1 version - Latest release: about 9 years ago - 1 dependent repository - 23 downloads last month - 0 stars on GitHub - 1 maintainer
Top 8.7% on pypi.org
jumpy 0.2.4
Numpy and nd4j interop
6 versions - Latest release: over 5 years ago - 4 dependent repositories - 90 downloads last month - 13,453 stars on GitHub - 3 maintainers
Top 9.2% on pypi.org
madoop 1.2.2
A lightweight MapReduce framework for education.
11 versions - Latest release: about 2 months ago - 2 dependent repositories - 275 downloads last month - 9 stars on GitHub - 1 maintainer
pydatavec 0.1.2
Python interface for DataVec
2 versions - Latest release: over 4 years ago - 1 dependent package - 1 dependent repository - 37 downloads last month - 13,290 stars on GitHub - 1 maintainer
Top 0.5% on pypi.org
luigi 3.5.0
Workflow mgmt + task scheduling + dependency resolution.
81 versions - Latest release: 4 months ago - 36 dependent packages - 586 dependent repositories - 336 thousand downloads last month - 17,072 stars on GitHub - 17 maintainers
hdfs-native 0.9.1
Python bindings for hdfs-native Rust library
11 versions - Latest release: about 1 month ago - 189 downloads last month - 18 stars on GitHub - 1 maintainer
knit 0.2.4 💰
Python wrapper for YARN Applications
6 versions - Latest release: over 6 years ago - 4 dependent repositories - 106 downloads last month - 53 stars on GitHub - 3 maintainers
hivejdbc 0.2.3
Hive database driver via jdbc
5 versions - Latest release: over 3 years ago - 1 dependent repository - 73.5 thousand downloads last month - 5 stars on GitHub - 1 maintainer
Top 7.1% on pypi.org
starbase 0.3.3
Python client for HBase Stargate REST server
13 versions - Latest release: over 9 years ago - 14 dependent repositories - 3.67 thousand downloads last month - 53 stars on GitHub - 1 maintainer
pymrgeo 1.0.2
MrGeo (pronounced "Mister Geo") is an open source geospatial toolkit designed to provide raster-b...
3 versions - Latest release: almost 7 years ago - 1 dependent repository - 23 downloads last month - 203 stars on GitHub - 1 maintainer
ambari-ldap-manager 0.7
A tool to manage Ambari users and groups when authentication uses LDAP.
5 versions - Latest release: almost 7 years ago - 1 dependent repository - 67 downloads last month - 0 stars on GitHub - 1 maintainer
Top 5.0% on pypi.org
pydoop 2.0.0
Pydoop: a Python MapReduce and HDFS API for Hadoop
17 versions - Latest release: almost 5 years ago - 8 dependent repositories - 202 thousand downloads last month - 234 stars on GitHub - 4 maintainers
Top 9.8% on pypi.org
pinball 0.2.12
Workflow manager and scheduler
13 versions - Latest release: about 8 years ago - 4 dependent repositories - 78 downloads last month - 1,047 stars on GitHub - 1 maintainer
Top 3.4% on pypi.org
yarn-api-client 1.0.3
Python client for Hadoop® YARN API
21 versions - Latest release: over 2 years ago - 5 dependent packages - 19 dependent repositories - 660 thousand downloads last month - 109 stars on GitHub - 4 maintainers
h2o-mlflow-flavor 0.1.0
An MLflow flavor for working with H2O-3 MOJO and POJO models
1 version - Latest release: 6 months ago - 39 downloads last month - 6,710 stars on GitHub - 1 maintainer
Top 4.4% on pypi.org
nflx-genie-client 3.6.17
Genie Python Client.
102 versions - Latest release: 10 months ago - 2 dependent repositories - 80.4 thousand downloads last month - 1,679 stars on GitHub - 3 maintainers
Top 6.1% on pypi.org
pyhdfs 0.3.1
Pure Python HDFS client
7 versions - Latest release: over 4 years ago - 2 dependent packages - 11 dependent repositories - 7.36 thousand downloads last month - 88 stars on GitHub - 1 maintainer
pydoris 1.0.5
Python interface to Doris
8 versions - Latest release: 3 months ago - 2 dependent packages - 24.2 thousand downloads last month - 10,473 stars on GitHub - 1 maintainer
Top 0.7% on pypi.org
h2o 3.46.0.2
H2O, Fast Scalable Machine Learning, for python
113 versions - Latest release: 6 days ago - 14 dependent packages - 393 dependent repositories - 323 thousand downloads last month - 6,710 stars on GitHub - 2 maintainers
dvc-hdfs 3.0.0
hdfs plugin for dvc
3 versions - Latest release: 5 months ago - 5 dependent packages - 3 dependent repositories - 10.7 thousand downloads last month - 0 stars on GitHub - 2 maintainers
Top 3.3% on pypi.org
dask-gateway 2024.1.0 💰
A client library for interacting with a dask-gateway server
22 versions - Latest release: 4 months ago - 13 dependent packages - 25 dependent repositories - 42.6 thousand downloads last month - 128 stars on GitHub - 4 maintainers
pydoris-client 1.0.4
Python interface to Doris
3 versions - Latest release: 6 months ago - 29 downloads last month - 11,272 stars on GitHub - 1 maintainer
hadoop-mapreduce 0.5
Implementation of Hadoop MapReduce on text files
4 versions - Latest release: over 1 year ago - 23 downloads last month - 0 stars on GitHub - 1 maintainer
yarn-dev-tools 2.0.2
Various scripts to automate and ease Apache Hadoop YARN development.
19 versions - Latest release: about 2 months ago - 162 downloads last month - 2 stars on GitHub - 1 maintainer
dfspy 0.1.0
Distributed File System written in Python
1 version - Latest release: almost 2 years ago - 19 downloads last month - 14 stars on GitHub - 1 maintainer
splitlog 3.0.0
Utility to split aggregated logs from Apache Hadoop Yarn applications into a folder hierarchy
10 versions - Latest release: 7 months ago - 52 downloads last month - 0 stars on GitHub - 1 maintainer
ym-impyla 0.14.0
Python client for the Impala distributed query engine
1 version - Latest release: over 7 years ago - 1 dependent repository - 11 downloads last month - 1 star on GitHub - 1 maintainer
yarnlog 0.2.1
Download Apache Hadoop YARN log to your local machine.
3 versions - Latest release: over 3 years ago - 1 dependent repository - 35 downloads last month - 0 stars on GitHub - 1 maintainer
yarntf 0.0.3.dev3
Easy distributed TensorFlow on Hops Hadoop
3 versions - Latest release: about 7 years ago - 1 dependent repository - 15 downloads last month - 33 stars on GitHub - 2 maintainers
webhdfspy 0.3.5
A wrapper library to access Hadoop HTTP REST API
8 versions - Latest release: over 7 years ago - 2 dependent repositories - 95 downloads last month - 8 stars on GitHub - 1 maintainer
trustedanalytics 0.7.3.post20161020785
Trusted Analytics Toolkit
161 versions - Latest release: over 7 years ago - 2 dependent repositories - 791 downloads last month - 43 stars on GitHub - 1 maintainer
tinyhdfs 1.1.4
Tiny client for HDFS, based on WebHDFS
1 version - Latest release: over 7 years ago - 1 dependent repository - 11 downloads last month - 2 stars on GitHub - 1 maintainer
thumbor_hbase 0.11
HBase image storage for Thumbor
11 versions - Latest release: about 11 years ago - 2 dependent repositories - 25 downloads last month - 9 stars on GitHub - 1 maintainer
tf-yarn-gpu 0.6.3
Distributed TensorFlow on a YARN cluster with GPU support
1 version - Latest release: over 2 years ago - 1 dependent repository - 12 downloads last month - 86 stars on GitHub - 1 maintainer
tf-yarn 0.7.0
Distributed TensorFlow or PyTorch on a YARN cluster
19 versions - Latest release: 7 months ago - 2 dependent repositories - 134 downloads last month - 86 stars on GitHub - 7 maintainers
streamsx.hdfs 1.5.9
HDFS integration for IBM Streams
18 versions - Latest release: over 3 years ago - 167 downloads last month - 9 stars on GitHub - 4 maintainers
sqoopit 0.0.12
A simple package to let you Sqoop into HDFS/Hive/HBase with Python
1 version - Latest release: about 4 years ago - 1 dependent repository - 16 downloads last month - 0 stars on GitHub - 1 maintainer
sqoopy 0.0.75
UNKNOWN
21 versions - Latest release: over 8 years ago - 2 dependent repositories - 86 downloads last month - 1 maintainer
spooq 3.4.0
Spooq is a PySpark-based helper library for ETL data ingestion pipelines in data lakes.
11 versions - Latest release: 2 months ago - 1 dependent repository - 19.3 thousand downloads last month - 8 stars on GitHub - 1 maintainer
sparkdh 0.0.1
1 version - Latest release: over 2 years ago - 1 dependent repository - 6 downloads last month - 0 stars on GitHub - 1 maintainer
snakeriver 0.1.3
Another way to think about Hadoop Streaming in Python
4 versions - Latest release: almost 11 years ago - 2 dependent repositories - 15 downloads last month - 1 maintainer
Top 4.6% on pypi.org
snakebite-py3 3.0.5
Pure Python HDFS client
5 versions - Latest release: over 4 years ago - 4 dependent packages - 18 dependent repositories - 127 thousand downloads last month - 22 stars on GitHub - 1 maintainer
Top 4.4% on pypi.org
skein 0.8.2
A simple tool and library for deploying applications on Apache YARN
23 versions - Latest release: about 2 years ago - 2 dependent packages - 12 dependent repositories - 25.2 thousand downloads last month - 138 stars on GitHub - 1 maintainer
sdctool 0.11.0
Streamsets DataCollector API utility
3 versions - Latest release: about 6 years ago - 1 dependent repository - 13 downloads last month - 13 stars on GitHub - 1 maintainer
pymongo_hadoop 1.1.0
UNKNOWN
2 versions - Latest release: about 11 years ago - 3 dependent repositories - 25 downloads last month - 1,522 stars on GitHub - 1 maintainer
pyhadoop 0.1
Python-based Hadoop command-line interface
1 version - Latest release: about 10 years ago - 2 dependent repositories - 96 downloads last month - 3 stars on GitHub - 1 maintainer
pydistcp 1.0.7
pydistcp: python WebHDFS inter/intra-cluster data copy tool.
6 versions - Latest release: over 3 years ago - 1 dependent repository - 21 downloads last month - 10 stars on GitHub - 1 maintainer
pycarbon-sdk 0.1.0
Pycarbon is a library that optimizes data access for AI based on CarbonData files, and it is bas...
1 version - Latest release: about 4 years ago - 1 dependent repository - 26 downloads last month - 1,418 stars on GitHub - 1 maintainer
py4hdfs 0.0.1
Fast Queries to HDFS
1 version - Latest release: about 8 years ago - 2 dependent repositories - 9 downloads last month - 1 star on GitHub - 1 maintainer
pomsets-core 1.0.9
workflow management for the cloud
10 versions - Latest release: over 13 years ago - 2 dependent repositories - 36 downloads last month - 1 maintainer
pomsets-gui 1.0.10
GUI for workflow management for the cloud
11 versions - Latest release: over 13 years ago - 2 dependent repositories - 42 downloads last month - 1 maintainer
odata2avro 1.0.0
Convert OData datasets to Avro
3 versions - Latest release: about 9 years ago - 2 dependent repositories - 15 downloads last month - 2 stars on GitHub - 1 maintainer
nutch 1.10.3 💰
Apache Nutch Python library
4 versions - Latest release: over 8 years ago - 2 dependent repositories - 40 downloads last month - 36 stars on GitHub - 1 maintainer
Top 8.6% on pypi.org
h2o-client 3.46.0.2
H2O, Fast Scalable Machine Learning, for python
13 versions - Latest release: 6 days ago - 3 dependent repositories - 587 downloads last month - 6,710 stars on GitHub - 1 maintainer
ls-thrift-py-hadoop 1-cdh4.3.0
Hadoop and Hive Python Thrift Libs
3 versions - Latest release: almost 11 years ago - 2 dependent repositories - 11 downloads last month - 10 stars on GitHub - 1 maintainer
Top 4.6% on pypi.org
hive-thrift-py 0.0.1
Hive Python Thrift Libs
1 version - Latest release: over 11 years ago - 32 dependent repositories - 2.43 thousand downloads last month - 1 maintainer
hivehoney 1.0.4
Client-less data retrieval from Hive.
5 versions - Latest release: over 5 years ago - 1 dependent repository - 18 downloads last month - 3 stars on GitHub - 1 maintainer
Top 7.8% on pypi.org
hbase-python 0.5
User friendly HBase client for Python 3. (Pure python implementation)
5 versions - Latest release: almost 6 years ago - 6 dependent repositories - 6.06 thousand downloads last month - 41 stars on GitHub - 1 maintainer
hadopy 0.1.8
Easy parallel map-reduce command line tool
8 versions - Latest release: about 3 years ago - 1 dependent repository - 24 downloads last month - 8 stars on GitHub - 1 maintainer
hadoop-yarn-rest-api 1.1.0
Python wrapper for Hadoop YARN REST API
5 versions - Latest release: almost 5 years ago - 1 dependent repository - 2.21 thousand downloads last month - 0 stars on GitHub - 1 maintainer
hadoop-protoseq 0.0.1
Python library for Hadoop Streaming with support of protobuf sequences
1 version - Latest release: almost 3 years ago - 1 dependent repository - 17 downloads last month - 1 star on GitHub - 1 maintainer
hadoop-fs-wrapper 0.6.1
Python Wrapper for Hadoop Java API
8 versions - Latest release: 11 months ago - 1 dependent package - 1 dependent repository - 2.67 thousand downloads last month - 3 stars on GitHub - 1 maintainer
hadeploy 0.6.1
A Hadoop application deployment tool
12 versions - Latest release: over 5 years ago - 1 dependent repository - 112 downloads last month - 10 stars on GitHub - 1 maintainer
Top 7.8% on pypi.org
dist-keras 0.2.1
Distributed deep learning with Apache Spark and Keras.
3 versions - Latest release: over 6 years ago - 1 dependent repository - 6.95 thousand downloads last month - 624 stars on GitHub - 1 maintainer
dbt-doris 0.3.4
The Doris adapter plugin for dbt
8 versions - Latest release: 8 months ago - 1 dependent repository - 132 downloads last month - 10,473 stars on GitHub - 1 maintainer
datagen 1.0.1
Generate delimited sample data with a simple schema.
1 version - Latest release: 9 months ago - 2 dependent repositories - 8 stars on GitHub - 1 maintainer
cyhdfs 0.1.3
Cython wrapper around libhdfs
2 versions - Latest release: over 11 years ago - 2 dependent repositories - 13 downloads last month - 1 maintainer
cornet 0.1.3
Easily generate Apache Sqoop commands based on YAML config file
5 versions - Latest release: almost 9 years ago - 2 dependent repositories - 44 downloads last month - 10 stars on GitHub - 1 maintainer
cobra-policytool 1.1.6
Tool for managing Hadoop access using Apache Atlas and Ranger.
9 versions - Latest release: almost 5 years ago - 1 dependent repository - 142 downloads last month - 16 stars on GitHub - 1 maintainer
Top 7.7% on pypi.org
cluster-pack 0.3.7
A library on top of either pex or conda-pack to make your Python code easily available on a cluster
44 versions - Latest release: 19 days ago - 2 dependent packages - 5 dependent repositories - 683 downloads last month - 43 stars on GitHub - 7 maintainers
clusterdock 2.3.0
clusterdock is a framework for creating Docker-based container clusters
24 versions - Latest release: almost 4 years ago - 1 dependent repository - 341 downloads last month - 29 stars on GitHub - 3 maintainers
cirruscluster 0.0.1-17
A batteries-included MapReduce cluster-in-a-can for scientists, researchers, and engineers.
5 versions - Latest release: almost 11 years ago - 3 dependent repositories - 8 downloads last month - 2 stars on GitHub - 1 maintainer
bigdata 0.0.3
IPython magic for running Apache tools for Big Data
4 versions - Latest release: almost 5 years ago - 919 downloads last month - 1 maintainer
aiowebhdfs 0.0.2
A modern and asynchronous web client for WebHDFS
2 versions - Latest release: about 4 years ago - 39 downloads last month - 7 stars on GitHub - 1 maintainer
kiji-bento-cluster 2.0.10
CDH and Datastax Enterprise docker single-node development cluster.
6 versions - Latest release: over 9 years ago - 2 dependent repositories - 23 downloads last month - 1 maintainer
Top 4.8% on pypi.org
dask-gateway-server 2024.1.0 💰
A multi-tenant server for securely deploying and managing multiple Dask clusters.
22 versions - Latest release: 4 months ago - 1 dependent package - 10 dependent repositories - 6.32 thousand downloads last month - 128 stars on GitHub - 4 maintainers
jupyterhub-yarnspawner 0.4.0
JupyterHub Spawner for Apache Hadoop/YARN Clusters
4 versions - Latest release: almost 5 years ago - 1 dependent repository - 69 downloads last month - 2 stars on GitHub - 1 maintainer
Top 2.3% on pypi.org
snakebite 2.11.0
Pure Python HDFS client
63 versions - Latest release: almost 8 years ago - 1 dependent package - 88 dependent repositories - 47.5 thousand downloads last month - 858 stars on GitHub - 1 maintainer
analytics-command-center 3.0.14
Command Center for Data Ingestion, Advanced Analytics and Artificial Intelligence processes
1 version - Latest release: over 2 years ago - 26 downloads last month - 11 stars on GitHub - 1 maintainer
sparkpickle 1.0.1
Provides functions for reading SequenceFile-s with Python pickles.
2 versions - Latest release: over 7 years ago - 1 dependent repository - 5.3 thousand downloads last month - 24 stars on GitHub - 1 maintainer
Top 4.8% on pypi.org
dispy 4.15.2
Distributed and Parallel Computing with/for Python.
80 versions - Latest release: over 1 year ago - 2 dependent packages - 21 dependent repositories - 397 downloads last month - 255 stars on GitHub - 2 maintainers
cassandralauncher 1.20
Command line utilities for launching Cassandra clusters in EC2
42 versions - Latest release: about 10 years ago - 48 downloads last month - 46 stars on GitHub - 1 maintainer
python_hiveish 1.1.0
A Hive-like interface wrapper around Hadoopy that allows SQL-like queries on top of MapReduce dire...
2 versions - Latest release: over 8 years ago - 2 dependent repositories - 11 downloads last month - 1 star on GitHub - 1 maintainer
python3-lzo-indexer 0.3.0
Library for indexing LZO compressed files
4 versions - Latest release: over 5 years ago - 1 dependent repository - 69 downloads last month - 1 star on GitHub - 1 maintainer
risk-command-center 1.0.37
Risk Command Center: manage your risk easily.
2 versions - Latest release: about 2 years ago - 1 dependent repository - 10 downloads last month - 11 stars on GitHub - 1 maintainer
hcompressor 1.0.0 removed
Hcompressor is a tool to compress files in HDFS
1 version - Latest release: over 1 year ago - 15 downloads last month - 1 star on GitHub - 1 maintainer
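
Each entry above ends with a summary line whose fields are separated by " - " (versions, latest release, dependents, downloads, stars, maintainers). For readers who want to work with this listing as data, the following is a minimal Python sketch of how such a line could be split into fields. It is not part of the Ecosyste.ms API: the function names `parse_count` and `parse_summary` and the output keys are hypothetical, and the "million" multiplier is an assumption, since only "thousand" appears in this listing.

```python
import re

# Multipliers for spelled-out magnitudes. "thousand" appears in the listing;
# "million" is an assumption added for completeness.
_SCALE = {"thousand": 1_000, "million": 1_000_000}

# Map the labels used in the listing (singular or plural) to hypothetical keys.
_LABELS = {
    "version": "versions",
    "versions": "versions",
    "dependent package": "dependent_packages",
    "dependent packages": "dependent_packages",
    "dependent repository": "dependent_repositories",
    "dependent repositories": "dependent_repositories",
    "downloads last month": "downloads_last_month",
    "star on GitHub": "stars",
    "stars on GitHub": "stars",
    "maintainer": "maintainers",
    "maintainers": "maintainers",
}

_FIELD = re.compile(r"^([\d.,]+(?: (?:thousand|million))?) (.+)$")


def parse_count(text):
    """Turn '17,382', '689 thousand' or '3.67 thousand' into an integer."""
    parts = text.replace(",", "").split()
    value = float(parts[0])
    if len(parts) > 1:
        value *= _SCALE[parts[1]]
    return int(round(value))


def parse_summary(line):
    """Split one summary line into a dict; fields absent from the line are omitted."""
    fields = {}
    for chunk in line.split(" - "):
        chunk = chunk.strip()
        if chunk.startswith("Latest release:"):
            # Keep the relative date as text, e.g. "6 months ago".
            fields["latest_release"] = chunk.split(":", 1)[1].strip()
            continue
        match = _FIELD.match(chunk)
        if match:
            label = match.group(2)
            fields[_LABELS.get(label, label)] = parse_count(match.group(1))
    return fields


if __name__ == "__main__":
    example = (
        "52 versions - Latest release: 6 months ago - 29 dependent packages - "
        "251 dependent repositories - 689 thousand downloads last month - "
        "723 stars on GitHub - 13 maintainers"
    )
    print(parse_summary(example))
```

Because the listing rounds large download counts (for example "689 thousand"), the parsed numbers are approximations of the registry's figures, not exact values.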