Ecosyste.ms: Packages

An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org "bigdata" keyword

cybershuttle-sdk 0.0.1a4
Python SDK for Apache Airavata Cybershuttle
3 versions - Latest release: 8 months ago - 22 downloads last month - 104 stars on GitHub - 1 maintainer
pscli 1.1.1
Unofficial Cisco Parstream Client with improved cli
3 versions - Latest release: over 3 years ago - 1 dependent repositories - 12 downloads last month - 2 stars on GitHub - 1 maintainer
airavata-django-portal-sdk 1.8.4
The Airavata Django Portal SDK is a library that makes it easier to develop Airavata Django Porta...
33 versions - Latest release: 11 months ago - 3 dependent repositories - 250 downloads last month - 0 stars on GitHub - 3 maintainers
Top 9.4% on pypi.org
pyoptimus 0.1.0
Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion.
32 versions - Latest release: over 1 year ago - 1 dependent repositories - 277 downloads last month - 1,447 stars on GitHub - 2 maintainers
legend-pydataobj 1.7.0
LEGEND Python Data Objects
18 versions - Latest release: about 1 month ago - 3 dependent packages - 808 downloads last month - 0 stars on GitHub - 1 maintainer
Top 8.3% on pypi.org
visualpython 3.0.2 💰
Visual Python is a GUI-based Python code generator, developed on the Jupyter Notebook as an exten...
89 versions - Latest release: 27 days ago - 1 dependent repositories - 2.34 thousand downloads last month - 755 stars on GitHub - 1 maintainer
Top 5.5% on pypi.org
jupyterlab-visualpython 3.0.2 💰
GUI-based Python code generator for Jupyter Lab as an extension
23 versions - Latest release: 27 days ago - 4 dependent repositories - 1.59 thousand downloads last month - 755 stars on GitHub - 1 maintainer
lettria 6.0.2
Lettria official SDK for python
31 versions - Latest release: 9 months ago - 1 dependent repositories - 1.2 thousand downloads last month - 11 stars on GitHub - 1 maintainer
databend 1.2.453
Databend Python Binding
260 versions - Latest release: about 1 month ago - 858 downloads last month - 7,150 stars on GitHub - 3 maintainers
bigartm10 0.10.1
BigARTM: the state-of-the-art platform for topic modeling
1 version - Latest release: over 4 years ago - 1 dependent repositories - 139 downloads last month - 662 stars on GitHub - 2 maintainers
bigartm9 0.9.2 removed
BigARTM: the state-of-the-art platform for topic modeling
1 version - Latest release: over 1 year ago - 18 downloads last month - 630 stars on GitHub - 1 maintainer
Top 5.0% on pypi.org
bigartm 0.9.2
BigARTM: the state-of-the-art platform for topic modeling
1 version - Latest release: over 4 years ago - 1 dependent package - 11 dependent repositories - 582 downloads last month - 662 stars on GitHub - 2 maintainers
Top 3.3% on pypi.org
fastwarc 0.14.7
A high-performance WARC parsing library for Python written in C++/Cython.
72 versions - Latest release: about 1 month ago - 6 dependent packages - 5 dependent repositories - 147 thousand downloads last month - 43 stars on GitHub - 1 maintainer
Top 4.3% on pypi.org
resiliparse 0.14.7
A collection of robust and fast processing tools for parsing and analyzing (not only) web archive...
66 versions - Latest release: about 1 month ago - 2 dependent packages - 4 dependent repositories - 1.86 thousand downloads last month - 43 stars on GitHub - 1 maintainer
pysparkgateway 0.0.22
Connect Pyspark to remote clusters
18 versions - Latest release: over 3 years ago - 1 dependent repositories - 18.1 thousand downloads last month - 3 stars on GitHub - 1 maintainer
objetive 0.6
A mini-crawler that aims to grab some text parts from a website or IP that responds to http*
6 versions - Latest release: over 4 years ago - 3 dependent repositories - 166 downloads last month - 0 stars on GitHub - 1 maintainer
Top 2.2% on pypi.org
uproot 5.3.7
ROOT I/O in pure Python and NumPy.
305 versions - Latest release: about 1 month ago - 83 dependent packages - 240 dependent repositories - 147 thousand downloads last month - 224 stars on GitHub - 2 maintainers
iterdict 0.2.0
Dict that lazily populates itself with items from the iterator it was constructed with as keys ar...
2 versions - Latest release: about 12 years ago - 1 dependent repositories - 8 downloads last month - 2 stars on GitHub - 1 maintainer
alphareader 0.0.7
A reader for large files with custom delimiters and encodings
7 versions - Latest release: about 4 years ago - 1 dependent repositories - 46 downloads last month - 5 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
cytoolz 0.12.3
Cython implementation of Toolz: High performance functional utilities
23 versions - Latest release: 5 months ago - 162 dependent packages - 11,437 dependent repositories - 2.47 million downloads last month - 964 stars on GitHub - 1 maintainer
zqytest 1.0.1
belongs to zqy.
1 version - Latest release: almost 7 years ago - 1 dependent repositories - 62 downloads last month - 1 maintainer
analytics-command-center 3.0.14
Command Center for Data Ingestion, Advanced Analytics and Artificial Intelligence process
1 version - Latest release: over 2 years ago - 14 downloads last month - 11 stars on GitHub - 1 maintainer
Top 0.4% on pypi.org
avro-python3 1.10.2
Avro is a serialization and RPC framework.
12 versions - Latest release: about 3 years ago - 37 dependent packages - 742 dependent repositories - 6.76 million downloads last month - 2,783 stars on GitHub - 7 maintainers
Top 0.3% on pypi.org
avro 1.11.3
Avro is a serialization and RPC framework.
32 versions - Latest release: 9 months ago - 59 dependent packages - 1,066 dependent repositories - 7.04 million downloads last month - 2,783 stars on GitHub - 8 maintainers
Top 1.1% on pypi.org
vispy 0.14.2
Interactive visualization in Python
38 versions - Latest release: 3 months ago - 73 dependent packages - 287 dependent repositories - 90.3 thousand downloads last month - 3,206 stars on GitHub - 3 maintainers
hadeploy 0.6.1
A Hadoop application deployment tool
12 versions - Latest release: over 5 years ago - 1 dependent repositories - 25 downloads last month - 10 stars on GitHub - 1 maintainer
Top 4.3% on pypi.org
vaex-arrow 0.5.1
Arrow support for vaex
12 versions - Latest release: about 4 years ago - 1 dependent package - 17 dependent repositories - 1.04 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
pytispark 1.0.1
TiSpark support for python
8 versions - Latest release: almost 6 years ago - 1 dependent repositories - 39 downloads last month - 877 stars on GitHub - 3 maintainers
pd-helper 1.0.0
A helpful script to optimize a Pandas DataFrame.
24 versions - Latest release: about 3 years ago - 1 dependent repositories - 133 downloads last month - 6 stars on GitHub - 1 maintainer
Top 1.3% on pypi.org
vaex-core 4.17.1
Core of vaex
101 versions - Latest release: 11 months ago - 16 dependent packages - 64 dependent repositories - 48.8 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 1.5% on pypi.org
vaex-server 0.9.0
Webserver and client for vaex for a remote dataset
21 versions - Latest release: 11 months ago - 3 dependent packages - 51 dependent repositories - 20.5 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 1.5% on pypi.org
vaex-jupyter 0.8.2
Jupyter notebook and Jupyter lab support for vaex
23 versions - Latest release: 11 months ago - 4 dependent packages - 50 dependent repositories - 20.9 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
cwlab 0.4.1
A platform-agnostic, cloud-ready framework for simplified deployment of the Common Workflow Langu...
10 versions - Latest release: about 3 years ago - 1 dependent repositories - 24 downloads last month - 32 stars on GitHub - 1 maintainer
django-mass-migration 0.2.9
Django app for long-running data migrations
17 versions - Latest release: 5 months ago - 1.47 thousand downloads last month - 2 stars on GitHub - 1 maintainer
bigdatacloudapi-client 1.0.3
A Python client for BigDataCloud API connectivity (https://www.bigdatacloud.com)
4 versions - Latest release: 9 months ago - 186 downloads last month - 10 stars on GitHub - 1 maintainer
ckitoolz 0.1.1
Cython implementation of Kitoolz: High performance functional utilities
3 versions - Latest release: about 3 years ago - 1 dependent repositories - 46 downloads last month - 1 maintainer
exelog 0.0.1
Enabling meticulous logging for Spark Applications
1 version - Latest release: over 2 years ago - 1 dependent repositories - 14 downloads last month - 5 stars on GitHub - 1 maintainer
pycebes 0.10.2
Python client for Cebes HTTP server.
2 versions - Latest release: over 6 years ago - 1 dependent repositories - 13 downloads last month - 2 stars on GitHub - 1 maintainer
Top 1.5% on pypi.org
vaex-hdf5 0.14.1
hdf5 file support for vaex
36 versions - Latest release: over 1 year ago - 4 dependent packages - 58 dependent repositories - 26.6 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
spark-celery 0.1.1
A helper to allow Python Celery tasks to do work in a Spark job
3 versions - Latest release: over 6 years ago - 1 dependent repositories - 360 downloads last month - 27 stars on GitHub - 1 maintainer
pyspark-spy 1.0.2
Collect and aggregate on spark events for profitz. In 🐍 way!
3 versions - Latest release: about 3 years ago - 1 dependent repositories - 207 downloads last month - 10 stars on GitHub - 1 maintainer
bdtool 9.0.0
script of managing some bigdata technology stack
11 versions - Latest release: about 2 years ago - 1 dependent repositories - 17 downloads last month - 0 stars on GitHub - 1 maintainer
vulkn 19.0.10
The environmentally friendly petabyte scale Python eco-system built on Yandex ClickHouse
10 versions - Latest release: over 4 years ago - 1 dependent repositories - 95 downloads last month - 43 stars on GitHub - 1 maintainer
Top 1.5% on pypi.org
vaex-viz 0.5.4
Visualization for vaex
24 versions - Latest release: over 1 year ago - 3 dependent packages - 54 dependent repositories - 20.8 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 1.6% on pypi.org
vaex-astro 0.9.3
Astronomy related transformations and FITS file support
18 versions - Latest release: over 1 year ago - 2 dependent packages - 51 dependent repositories - 20.4 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
mextractor 5.0.0 💰
mextractor can extract media metadata to YAML and read them
16 versions - Latest release: 2 months ago - 51 downloads last month - 4 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
vaex 4.17.0
Out-of-Core DataFrames to visualize and explore big tabular datasets
58 versions - Latest release: 11 months ago - 24 dependent packages - 90 dependent repositories - 22.3 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 1.6% on pypi.org
vaex-ml 0.18.3
Machine learning support for vaex
34 versions - Latest release: 11 months ago - 2 dependent packages - 47 dependent repositories - 20.8 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
zqy 1.0.5
belongs to zqy.
4 versions - Latest release: over 6 years ago - 1 dependent repositories - 33 downloads last month - 1 maintainer
robotframework-dynamodbsqllibrary 0.3.1
An Amazon AWS DynamoDB big data testing library for Robot Framework with SQL-like DSL
9 versions - Latest release: over 1 year ago - 2 dependent repositories - 1.33 thousand downloads last month - 4 stars on GitHub - 1 maintainer
bart-extract-ga 0.1.1
Extract event views to Google Analytics
2 versions - Latest release: over 3 years ago - 1 dependent repositories - 14 downloads last month - 0 stars on GitHub - 1 maintainer
akudu 0.0.1
Asyncio Client for Apache Kudu Database Engine (Storage System).
1 version - Latest release: over 1 year ago - 16 downloads last month - 1 stars on GitHub - 1 maintainer
harlequin-databend
A Harlequin adapter for Databend.
3 versions - 183 downloads last month - 7,150 stars on GitHub - 1 maintainer
package-lab-vr-demo-success 99.99.99
disclaimer, This package is used for security research and demonstrations. It might contain dange...
1 version - Latest release: about 2 years ago - 1 dependent repositories - 7 downloads last month - 3,382 stars on GitHub - 1 maintainer
bigdatasml 0.1.3
This package calculates average student performances
3 versions - Latest release: over 2 years ago - 22 downloads last month - 1 maintainer
datacatalog-custom-entries-manager 0.1.2
A package to manage Google Cloud Data Catalog custom entries
4 versions - Latest release: over 3 years ago - 1 dependent repositories - 31 downloads last month - 2 stars on GitHub - 1 maintainer
elixirnote 4.0.0a30
go from data to knowledge
4 versions - Latest release: over 1 year ago - 27 downloads last month - 10 stars on GitHub - 1 maintainer
Top 5.5% on pypi.org
arvados-python-client 2.7.2
Arvados client library
690 versions - Latest release: 2 months ago - 11 dependent repositories - 6.93 thousand downloads last month - 365 stars on GitHub - 1 maintainer
Top 9.9% on pypi.org
arvados-cwl-runner 2.7.2
Arvados Common Workflow Language runner
362 versions - Latest release: 2 months ago - 5 dependent repositories - 1.56 thousand downloads last month - 365 stars on GitHub - 1 maintainer
bart-simulator 0.2.1
Send event views to Google Analytics and Generator customers or products
4 versions - Latest release: over 3 years ago - 1 dependent repositories - 27 downloads last month - 0 stars on GitHub - 1 maintainer
sysxtract 1.0.0
Extract logs based off events from sysmon. Comes as a package, cli and ui.
1 version - Latest release: about 4 years ago - 1 dependent repositories - 14 downloads last month - 3 stars on GitHub - 1 maintainer
zqybxqqwetest 1.0.1
belongs to zqy.
1 version - Latest release: almost 7 years ago - 1 dependent repositories - 9 downloads last month - 1 maintainer
ukv 0.12.1
Python bindings for Unum's UStore.
23 versions - Latest release: about 1 year ago - 99 downloads last month - 488 stars on GitHub - 1 maintainer
Top 2.7% on pypi.org
uproot3 3.14.4
ROOT I/O in pure Python and NumPy.
5 versions - Latest release: over 3 years ago - 15 dependent packages - 30 dependent repositories - 44.7 thousand downloads last month - 314 stars on GitHub - 1 maintainer
ustore 1.7.5
Python bindings for Unum's UStore.
21 versions - Latest release: almost 2 years ago - 1 dependent repositories - 38 downloads last month - 488 stars on GitHub - 1 maintainer
pybda 0.1.0
Analysis of big biological data sets for distributed HPC clusters.
6 versions - Latest release: almost 5 years ago - 1 dependent repositories - 33 downloads last month - 9 stars on GitHub - 1 maintainer
vaex-distributed 0.3.0
Distributed dataset for vaex
3 versions - Latest release: about 5 years ago - 1 dependent repositories - 26 downloads last month - 8,171 stars on GitHub - 1 maintainer
athenacli 1.6.8
CLI for Athena Database. With auto-completion and syntax highlighting.
24 versions - Latest release: about 2 years ago - 1 dependent repositories - 1.27 thousand downloads last month - 205 stars on GitHub - 2 maintainers
fdict 0.8.1
Easy out-of-core computing of recursive dict
9 versions - Latest release: almost 7 years ago - 1 dependent package - 1 dependent repositories - 11.1 thousand downloads last month - 7 stars on GitHub - 1 maintainer
gigapipe 0.1.22
Gigapipe Python Client
13 versions - Latest release: almost 2 years ago - 1 dependent repositories - 92 downloads last month - 1 maintainer
pysparkproxy 0.0.17
Seamlessly execute pyspark code on remote clusters
9 versions - Latest release: over 5 years ago - 1 dependent repositories - 28 downloads last month - 4 stars on GitHub - 1 maintainer
unisonctl 0.1
A frontend for unison allowing it to handle some read-only large datasets more efficiently.
1 version - Latest release: almost 7 years ago - 1 dependent repositories - 6 downloads last month - 2 stars on GitHub - 1 maintainer
arvados_fuse 2.7.2
Arvados FUSE driver
503 versions - Latest release: 2 months ago - 2 dependent repositories - 1.28 thousand downloads last month - 365 stars on GitHub - 1 maintainer
arvados-pam 2.0.4
Arvados PAM module
32 versions - Latest release: almost 4 years ago - 68 downloads last month - 365 stars on GitHub - 1 maintainer
crunchstat_summary 2.7.2
Arvados crunchstat-summary reads crunch log files and summarizes resource usage
20 versions - Latest release: 2 months ago - 92 downloads last month - 365 stars on GitHub - 1 maintainer
wendelin.core 0.2
Out-of-core NumPy arrays
16 versions - Latest release: 10 months ago - 2 dependent repositories - 64 downloads last month - 1 maintainer
risk-command-center 1.0.37
Risk Command Center, manage your risk easily.
2 versions - Latest release: about 2 years ago - 1 dependent repositories - 19 downloads last month - 11 stars on GitHub - 1 maintainer
hdxcli 1.0rc51
Hydrolix command line utility to do CRUD operations on projects, tables, transforms and other res...
32 versions - Latest release: 3 months ago - 251 downloads last month - 6 stars on GitHub - 1 maintainer
Top 4.8% on pypi.org
uproot4 4.0.0
ROOT I/O in pure Python and NumPy.
30 versions - Latest release: over 3 years ago - 10 dependent packages - 11 dependent repositories - 44.8 thousand downloads last month - 199 stars on GitHub - 2 maintainers
dlopes7-avro 1.12.0
Avro is a serialization and RPC framework.
1 version - Latest release: 8 months ago - 39 downloads last month - 2,755 stars on GitHub - 1 maintainer
Top 9.3% on pypi.org
anovos 1.1.0
An Open Source tool for Feature Engineering in Machine Learning
8 versions - Latest release: over 1 year ago - 2 dependent repositories - 2.01 thousand downloads last month - 77 stars on GitHub - 1 maintainer
Top 9.6% on pypi.org
vaex-ui 0.3.0
Graphical user interface for vaex based on Qt
7 versions - Latest release: over 4 years ago - 3 dependent repositories - 38 downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 8.1% on pypi.org
datafaker 0.7.6
A tool for generating batch test data or stream data.
51 versions - Latest release: almost 3 years ago - 2 dependent repositories - 282 downloads last month - 607 stars on GitHub - 1 maintainer
zqybxqtest 1.0.1
belongs to zqy.
1 version - Latest release: almost 7 years ago - 1 dependent repositories - 9 downloads last month - 1 maintainer
dtshare 1.0.3
Open financial data
2 versions - Latest release: about 4 years ago - 1 dependent repositories - 35 downloads last month - 513 stars on GitHub - 1 maintainer
datacatalog-fileset-processor 0.1.5
A package to manage Google Cloud Data Catalog Fileset scripts
6 versions - Latest release: over 3 years ago - 1 dependent repositories - 206 downloads last month - 3 stars on GitHub - 1 maintainer
spark-yarn-submit 1.0.0
Library to handle Spark job submission in a YARN cluster across different environments
1 version - Latest release: over 7 years ago - 1 dependent repositories - 12 downloads last month - 3 stars on GitHub - 1 maintainer
grapetree 1.6.1
Web interface of GrapeTree, which is a program for phylogenetic analysis.
16 versions - Latest release: over 4 years ago - 1 dependent repositories - 210 downloads last month - 75 stars on GitHub - 3 maintainers
cuallee 0.10.2
Python library for data validation on DataFrame APIs including Snowflake/Snowpark, Apache/PySpark...
76 versions - Latest release: about 1 month ago - 1 dependent package - 1 dependent repositories - 11.6 thousand downloads last month - 111 stars on GitHub - 2 maintainers
lemuras 1.2.3
A small Python library to deal with big tables
6 versions - Latest release: about 4 years ago - 1 dependent repositories - 68 downloads last month - 3 stars on GitHub - 1 maintainer
mister 0.0.2
Approachable map/reduce jobs
2 versions - Latest release: over 5 years ago - 1 dependent repositories - 23 downloads last month - 0 stars on GitHub - 1 maintainer
vector-lake 0.0.4
S3 vector database for bigdata
4 versions - Latest release: 10 months ago - 30 downloads last month - 20 stars on GitHub - 1 maintainer
taospyudf 0.0.11
taos python udf
9 versions - Latest release: about 1 year ago - 926 downloads last month - 22,774 stars on GitHub - 1 maintainer
datacatalog-custom-model-manager 0.1.1
A package to load user-specified metadata models into Google Cloud Data Catalog
2 versions - Latest release: over 3 years ago - 1 dependent repositories - 39 downloads last month - 6 stars on GitHub - 1 maintainer
datacatalog-util 0.11.6
A package to manage Google Cloud Data Catalog helper commands and scripts
21 versions - Latest release: over 3 years ago - 1 dependent repositories - 354 downloads last month - 20 stars on GitHub - 1 maintainer
Top 4.4% on pypi.org
nflx-genie-client 3.6.17
Genie Python Client.
102 versions - Latest release: 10 months ago - 2 dependent repositories - 80.4 thousand downloads last month - 1,679 stars on GitHub - 3 maintainers
Top 3.7% on pypi.org
pybloom-live 4.0.0
Bloom filter: A Probabilistic data structure
7 versions - Latest release: over 1 year ago - 5 dependent packages - 65 dependent repositories - 1.37 million downloads last month - 157 stars on GitHub - 1 maintainer
Top 8.1% on pypi.org
km3io 1.1.0
"KM3NeT I/O library without ROOT"
67 versions - Latest release: 3 months ago - 2 dependent packages - 2 dependent repositories - 789 downloads last month - 314 stars on GitHub - 2 maintainers
pysparkify 0.27.0
Spark based ETL
18 versions - Latest release: 27 days ago - 1.01 thousand downloads last month - 1 stars on GitHub - 1 maintainer
elasticsearch-partition 2.0.0
A Python library for creating Elasticsearch partitioned indexes by date range
5 versions - Latest release: about 5 years ago - 1 dependent repositories - 68 downloads last month - 4 stars on GitHub - 1 maintainer