Ecosyste.ms: Packages

An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "bigdata" keyword

ckitoolz 0.1.1
Cython implementation of Kitoolz: High performance functional utilities
3 versions - Latest release: almost 3 years ago - 1 dependent repository - 46 downloads last month - 2 maintainers
cuallee 0.10.1
Python library for data validation on DataFrame APIs including Snowflake/Snowpark, Apache/PySpark...
74 versions - Latest release: 2 days ago - 1 dependent package - 1 dependent repository - 11.9 thousand downloads last month - 107 stars on GitHub - 2 maintainers
spark-celery 0.1.1
A helper to allow Python Celery tasks to do work in a Spark job
3 versions - Latest release: over 6 years ago - 1 dependent repository - 360 downloads last month - 27 stars on GitHub - 1 maintainer
Top 9.9% on pypi.org
arvados-cwl-runner 2.7.2
Arvados Common Workflow Language runner
361 versions - Latest release: 23 days ago - 5 dependent repositories - 1.65 thousand downloads last month - 364 stars on GitHub - 1 maintainer
Top 5.5% on pypi.org
arvados-python-client 2.7.2
Arvados client library
689 versions - Latest release: 23 days ago - 11 dependent repositories - 6.32 thousand downloads last month - 363 stars on GitHub - 1 maintainer
arvados_fuse 2.7.2
Arvados FUSE driver
503 versions - Latest release: 23 days ago - 2 dependent repositories - 1.54 thousand downloads last month - 363 stars on GitHub - 1 maintainer
arvados-pam 2.0.4
Arvados PAM module
32 versions - Latest release: over 3 years ago - 99 downloads last month - 363 stars on GitHub - 1 maintainer
crunchstat_summary 2.7.2
Arvados crunchstat-summary reads crunch log files and summarizes resource usage
20 versions - Latest release: 23 days ago - 302 downloads last month - 363 stars on GitHub - 1 maintainer
pysparkify 0.26.3
Spark based ETL
12 versions - Latest release: 1 day ago - 511 downloads last month - 1 star on GitHub - 2 maintainers
pycebes 0.10.2
Python client for Cebes HTTP server.
2 versions - Latest release: about 6 years ago - 1 dependent repository - 13 downloads last month - 2 stars on GitHub - 2 maintainers
Top 1.1% on pypi.org
vispy 0.14.2
Interactive visualization in Python
38 versions - Latest release: about 2 months ago - 53 dependent packages - 287 dependent repositories - 92.2 thousand downloads last month - 3,206 stars on GitHub - 3 maintainers
Top 8.3% on pypi.org
visualpython 3.0.1 💰
Visual Python is a GUI-based Python code generator, developed on the Jupyter Notebook as an exten...
88 versions - Latest release: 6 months ago - 1 dependent repository - 1.74 thousand downloads last month - 755 stars on GitHub - 1 maintainer
Top 5.5% on pypi.org
jupyterlab-visualpython 3.0.1 💰
GUI-based Python code generator for Jupyter Lab as an extension
22 versions - Latest release: 6 months ago - 4 dependent repositories - 1.09 thousand downloads last month - 755 stars on GitHub - 2 maintainers
exelog 0.0.1
Enabling meticulous logging for Spark Applications
1 version - Latest release: over 2 years ago - 1 dependent repository - 18 downloads last month - 5 stars on GitHub - 2 maintainers
Top 9.4% on pypi.org
pyoptimus 0.1.0
Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion.
32 versions - Latest release: over 1 year ago - 1 dependent repository - 339 downloads last month - 1,441 stars on GitHub - 2 maintainers
pyspark-spy 1.0.2
Collect and aggregate on spark events for profitz. In 🐍 way!
3 versions - Latest release: about 3 years ago - 1 dependent repository - 218 downloads last month - 10 stars on GitHub - 2 maintainers
vulkn 19.0.10
The environmentally friendly petabyte scale Python eco-system built on Yandex ClickHouse
10 versions - Latest release: about 4 years ago - 1 dependent repository - 78 downloads last month - 43 stars on GitHub - 2 maintainers
Top 1.3% on pypi.org
vaex-core 4.17.1
Core of vaex
101 versions - Latest release: 10 months ago - 15 dependent packages - 64 dependent repositories - 68.1 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 1.5% on pypi.org
vaex-jupyter 0.8.2
Jupyter notebook and Jupyter lab support for vaex
23 versions - Latest release: 10 months ago - 4 dependent packages - 50 dependent repositories - 22.1 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 1.5% on pypi.org
vaex-server 0.9.0
Webserver and client for vaex for a remote dataset
21 versions - Latest release: 10 months ago - 3 dependent packages - 51 dependent repositories - 20.8 thousand downloads last month - 8,171 stars on GitHub - 2 maintainers
bdtool 9.0.0
script of managing some bigdata technology stack
11 versions - Latest release: about 2 years ago - 1 dependent repository - 29 downloads last month - 0 stars on GitHub - 1 maintainer
legend-pydataobj 1.6.2
LEGEND Python Data Objects
17 versions - Latest release: 1 day ago - 3 dependent packages - 1.98 thousand downloads last month - 0 stars on GitHub - 2 maintainers
mextractor 5.0.0 💰
mextractor can extract media metadata to YAML and read them
16 versions - Latest release: about 1 month ago - 117 downloads last month - 4 stars on GitHub - 2 maintainers
zqy 1.0.5
belongs to zqy.
4 versions - Latest release: over 6 years ago - 1 dependent repository - 45 downloads last month - 2 maintainers
Top 1.5% on pypi.org
vaex-hdf5 0.14.1
hdf5 file support for vaex
36 versions - Latest release: over 1 year ago - 4 dependent packages - 58 dependent repositories - 26.9 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
robotframework-dynamodbsqllibrary 0.3.1
An Amazon AWS DynamoDB big data testing library for Robot Framework with SQL-like DSL
9 versions - Latest release: about 1 year ago - 2 dependent repositories - 1.01 thousand downloads last month - 4 stars on GitHub - 1 maintainer
bart-extract-ga 0.1.1
Extract event views to Google Analytics
2 versions - Latest release: over 3 years ago - 1 dependent repository - 7 downloads last month - 0 stars on GitHub - 1 maintainer
Top 4.3% on pypi.org
resiliparse 0.14.7
A collection of robust and fast processing tools for parsing and analyzing (not only) web archive...
66 versions - Latest release: 3 days ago - 2 dependent packages - 4 dependent repositories - 19.9 thousand downloads last month - 42 stars on GitHub - 1 maintainer
Top 3.3% on pypi.org
fastwarc 0.14.7
A high-performance WARC parsing library for Python written in C++/Cython.
72 versions - Latest release: 3 days ago - 4 dependent packages - 5 dependent repositories - 199 thousand downloads last month - 42 stars on GitHub - 1 maintainer
Top 0.3% on pypi.org
avro 1.11.3
Avro is a serialization and RPC framework.
32 versions - Latest release: 7 months ago - 46 dependent packages - 1,066 dependent repositories - 7.1 million downloads last month - 2,755 stars on GitHub - 8 maintainers
Top 0.4% on pypi.org
avro-python3 1.10.2
Avro is a serialization and RPC framework.
12 versions - Latest release: about 3 years ago - 34 dependent packages - 742 dependent repositories - 6.84 million downloads last month - 2,755 stars on GitHub - 7 maintainers
bigdatasml 0.1.3
This package calculates average student performances
3 versions - Latest release: over 2 years ago - 40 downloads last month - 2 maintainers
akudu 0.0.1
Asyncio Client for Apache Kudu Database Engine (Storage System).
1 version - Latest release: about 1 year ago - 19 downloads last month - 1 star on GitHub - 2 maintainers
Top 2.2% on pypi.org
uproot 5.3.3
ROOT I/O in pure Python and NumPy.
300 versions - Latest release: 21 days ago - 69 dependent packages - 240 dependent repositories - 235 thousand downloads last month - 218 stars on GitHub - 2 maintainers
Top 1.5% on pypi.org
vaex-viz 0.5.4
Visualization for vaex
24 versions - Latest release: over 1 year ago - 3 dependent packages - 54 dependent repositories - 21.5 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 1.6% on pypi.org
vaex-astro 0.9.3
Astronomy related transformations and FITS file support
18 versions - Latest release: over 1 year ago - 2 dependent packages - 51 dependent repositories - 20.6 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
datacatalog-custom-entries-manager 0.1.2
A package to manage Google Cloud Data Catalog custom entries
4 versions - Latest release: over 3 years ago - 1 dependent repository - 57 downloads last month - 2 stars on GitHub - 2 maintainers
elixirnote 4.0.0a30
go from data to knowledge
4 versions - Latest release: over 1 year ago - 19 downloads last month - 10 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
vaex 4.17.0
Out-of-Core DataFrames to visualize and explore big tabular datasets
58 versions - Latest release: 10 months ago - 20 dependent packages - 90 dependent repositories - 21.6 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 1.6% on pypi.org
vaex-ml 0.18.3
Machine learning support for vaex
34 versions - Latest release: 10 months ago - 2 dependent packages - 47 dependent repositories - 20.7 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
cytoolz 0.12.3
Cython implementation of Toolz: High performance functional utilities
23 versions - Latest release: 3 months ago - 88 dependent packages - 11,437 dependent repositories - 2.36 million downloads last month - 964 stars on GitHub - 1 maintainer
bart-simulator 0.2.1
Send event views to Google Analytics and Generator customers or products
4 versions - Latest release: over 3 years ago - 1 dependent repository - 47 downloads last month - 0 stars on GitHub - 1 maintainer
sysxtract 1.0.0
Extract logs based off events from sysmon. Comes as a package, cli and ui.
1 version - Latest release: almost 4 years ago - 1 dependent repository - 9 downloads last month - 3 stars on GitHub - 2 maintainers
Top 6.2% on pypi.org
vaex-graphql 0.2.0
GraphQL support for accessing vaex DataFrame
3 versions - Latest release: about 3 years ago - 9 dependent repositories - 171 downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 2.7% on pypi.org
uproot3 3.14.4
ROOT I/O in pure Python and Numpy.
5 versions - Latest release: about 3 years ago - 13 dependent packages - 30 dependent repositories - 14.5 thousand downloads last month - 314 stars on GitHub - 1 maintainer
zqybxqqwetest 1.0.1
belongs to zqy.
1 version - Latest release: almost 7 years ago - 1 dependent repository - 16 downloads last month - 2 maintainers
Top 4.4% on pypi.org
nflx-genie-client 3.6.17
Genie Python Client.
102 versions - Latest release: 9 months ago - 2 dependent repositories - 78.3 thousand downloads last month - 1,679 stars on GitHub - 6 maintainers
foobar22 1.2.0 removed
This package is used for security research and demonstrations. It might contain dangerous code sn...
1 version - Latest release: almost 2 years ago - 2 dependent repositories - 598 stars on GitHub
ustore 1.7.5
Python bindings for Unum's UStore.
21 versions - Latest release: almost 2 years ago - 1 dependent repository - 27 downloads last month - 478 stars on GitHub - 2 maintainers
ukv 0.12.1
Python bindings for Unum's UStore.
23 versions - Latest release: about 1 year ago - 145 downloads last month - 478 stars on GitHub - 1 maintainer
vaex-distributed 0.3.0
Distributed dataset for vaex
3 versions - Latest release: about 5 years ago - 1 dependent repository - 18 downloads last month - 8,171 stars on GitHub - 2 maintainers
athenacli 1.6.8
CLI for Athena Database. With auto-completion and syntax highlighting.
24 versions - Latest release: almost 2 years ago - 1 dependent repository - 1.21 thousand downloads last month - 206 stars on GitHub - 2 maintainers
pybda 0.1.0
Analysis of big biological data sets for distributed HPC clusters.
6 versions - Latest release: over 4 years ago - 1 dependent repository - 54 downloads last month - 9 stars on GitHub - 2 maintainers
gigapipe 0.1.22
Gigapipe Python Client
13 versions - Latest release: over 1 year ago - 1 dependent repository - 73 downloads last month - 2 maintainers
fdict 0.8.1
Easy out-of-core computing of recursive dict
9 versions - Latest release: over 6 years ago - 1 dependent package - 1 dependent repository - 14.5 thousand downloads last month - 7 stars on GitHub - 2 maintainers
cybershuttle-sdk 0.0.1a4
Python SDK for Apache Airavata Cybershuttle
3 versions - Latest release: 7 months ago - 26 downloads last month - 98 stars on GitHub - 2 maintainers
unisonctl 0.1
A frontend for unison allowing it to handle some read-only large datasets more efficiently.
1 version - Latest release: over 6 years ago - 1 dependent repository - 10 downloads last month - 2 stars on GitHub - 2 maintainers
pysparkproxy 0.0.17
Seamlessly execute pyspark code on remote clusters
9 versions - Latest release: over 5 years ago - 1 dependent repository - 43 downloads last month - 4 stars on GitHub - 2 maintainers
risk-command-center 1.0.37
Risk Command Center, manage your risk easily.
2 versions - Latest release: almost 2 years ago - 1 dependent repository - 10 downloads last month - 11 stars on GitHub - 1 maintainer
airavata-django-portal-sdk 1.8.4
The Airavata Django Portal SDK is a library that makes it easier to develop Airavata Django Porta...
33 versions - Latest release: 10 months ago - 3 dependent repositories - 185 downloads last month - 0 stars on GitHub - 3 maintainers
wendelin.core 0.2
Out-of-core NumPy arrays
16 versions - Latest release: 9 months ago - 2 dependent repositories - 38 downloads last month - 2 maintainers
hdxcli 1.0rc51
Hydrolix command line utility to do CRUD operations on projects, tables, transforms and other res...
32 versions - Latest release: about 1 month ago - 142 downloads last month - 6 stars on GitHub - 1 maintainer
Top 3.7% on pypi.org
pybloom-live 4.0.0
Bloom filter: A Probabilistic data structure
7 versions - Latest release: over 1 year ago - 4 dependent packages - 65 dependent repositories - 657 thousand downloads last month - 158 stars on GitHub - 2 maintainers
Top 4.8% on pypi.org
uproot4 4.0.0
ROOT I/O in pure Python and NumPy.
30 versions - Latest release: over 3 years ago - 10 dependent packages - 11 dependent repositories - 20.7 thousand downloads last month - 199 stars on GitHub - 4 maintainers
Top 9.3% on pypi.org
anovos 1.1.0
An Open Source tool for Feature Engineering in Machine Learning
8 versions - Latest release: over 1 year ago - 2 dependent repositories - 1.56 thousand downloads last month - 77 stars on GitHub - 1 maintainer
zqybxqtest 1.0.1
belongs to zqy.
1 version - Latest release: almost 7 years ago - 1 dependent repository - 3 downloads last month - 2 maintainers
Top 9.6% on pypi.org
vaex-ui 0.3.0
Graphical user interface for vaex based on Qt
7 versions - Latest release: over 4 years ago - 3 dependent repositories - 39 downloads last month - 8,171 stars on GitHub - 2 maintainers
dtshare 1.0.3
Open financial data
2 versions - Latest release: about 4 years ago - 1 dependent repository - 28 downloads last month - 514 stars on GitHub - 2 maintainers
Top 8.1% on pypi.org
datafaker 0.7.6
A tool for generating batch test data or stream data.
51 versions - Latest release: over 2 years ago - 2 dependent repositories - 158 downloads last month - 607 stars on GitHub - 2 maintainers
datacatalog-fileset-processor 0.1.5
A package to manage Google Cloud Data Catalog Fileset scripts
6 versions - Latest release: over 3 years ago - 1 dependent repository - 160 downloads last month - 3 stars on GitHub - 2 maintainers
spark-yarn-submit 1.0.0
library to handle spark job submit in a yarn cluster in different environment
1 version - Latest release: about 7 years ago - 1 dependent repository - 3 downloads last month - 3 stars on GitHub - 2 maintainers
grapetree 1.6.1
Web interface of GrapeTree, which is a program for phylogenetic analysis.
16 versions - Latest release: over 4 years ago - 1 dependent repository - 109 downloads last month - 74 stars on GitHub - 6 maintainers
datacatalog-tag-manager 2.2.0
A package to manage Google Cloud Data Catalog tags, loading metadata from external sources
15 versions - Latest release: over 3 years ago - 1 dependent repository - 249 downloads last month - 17 stars on GitHub - 2 maintainers
lemuras 1.2.3
A small Python library to deal with big tables
6 versions - Latest release: about 4 years ago - 1 dependent repository - 12 downloads last month - 3 stars on GitHub - 2 maintainers
package-lab-vr-demo-success 99.99.99
disclaimer, This package is used for security research and demonstrations. It might contain dange...
1 version - Latest release: almost 2 years ago - 1 dependent repository - 4 downloads last month - 3,355 stars on GitHub - 1 maintainer
mister 0.0.2
Approachable map/reduce jobs
2 versions - Latest release: over 5 years ago - 1 dependent repository - 12 downloads last month - 0 stars on GitHub - 2 maintainers
dlopes7-avro 1.12.0
Avro is a serialization and RPC framework.
1 version - Latest release: 6 months ago - 20 downloads last month - 2,706 stars on GitHub - 1 maintainer
vector-lake 0.0.4
S3 vector database for bigdata
4 versions - Latest release: 9 months ago - 21 downloads last month - 19 stars on GitHub - 2 maintainers
django-mass-migration 0.2.9
Django app for long-running data migrations
16 versions - Latest release: 4 months ago - 1.11 thousand downloads last month - 2 stars on GitHub - 2 maintainers
taospyudf 0.0.11
taos python udf
9 versions - Latest release: 12 months ago - 589 downloads last month - 22,767 stars on GitHub - 1 maintainer
datacatalog-custom-model-manager 0.1.1
A package to load user-specified metadata models into Google Cloud Data Catalog
2 versions - Latest release: over 3 years ago - 1 dependent repository - 8 downloads last month - 6 stars on GitHub - 2 maintainers
datacatalog-util 0.11.6
A package to manage Google Cloud Data Catalog helper commands and scripts
21 versions - Latest release: over 3 years ago - 1 dependent repository - 172 downloads last month - 20 stars on GitHub - 2 maintainers
Top 8.1% on pypi.org
km3io 1.1.0
"KM3NeT I/O library without ROOT"
67 versions - Latest release: about 2 months ago - 2 dependent packages - 2 dependent repositories - 751 downloads last month - 314 stars on GitHub - 4 maintainers
databend 1.2.344
Databend Python Binding
259 versions - Latest release: 2 months ago - 368 downloads last month - 7,150 stars on GitHub - 4 maintainers
elasticsearch-partition 2.0.0
A Python library for creating Elasticsearch partitioned indexes by date range
5 versions - Latest release: about 5 years ago - 1 dependent repository - 12 downloads last month - 4 stars on GitHub - 2 maintainers
dpark 0.5.0
Python clone of Spark, MapReduce like computing framework supporting iterative algorithms.
19 versions - Latest release: almost 6 years ago - 1 dependent repository - 28 downloads last month - 2,693 stars on GitHub - 4 maintainers
panel-vegafusion 0.0.3
Build interactive big data apps with Altair and Vega easily using Panel + VegaFusion.
3 versions - Latest release: over 2 years ago - 1 dependent repository - 24 downloads last month - 14 stars on GitHub - 2 maintainers
hivehoney 1.0.4
Client-less data retrieval from Hive.
5 versions - Latest release: about 5 years ago - 1 dependent repository - 18 downloads last month - 3 stars on GitHub - 2 maintainers
vaex-contrib 0.1.3
Community contributed modules to vaex
3 versions - Latest release: over 1 year ago - 1 dependent repository - 17 downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 4.8% on pypi.org
optimuspyspark 2.2.32
Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion wi...
83 versions - Latest release: almost 4 years ago - 8 dependent repositories - 8.75 thousand downloads last month - 1,439 stars on GitHub - 4 maintainers
workflux 0.5.0
A platform-agnostic, cloud-ready framework for simplified deployment of the Common Workflow Langu...
1 version - Latest release: almost 3 years ago - 1 dependent repository - 11 downloads last month - 32 stars on GitHub - 2 maintainers
pytispark 1.0.1
TiSpark support for python
8 versions - Latest release: almost 6 years ago - 1 dependent repository - 49 downloads last month - 878 stars on GitHub - 3 maintainers
heppyness 2.1.0
ROOT I/O in pure Python and NumPy.
18 versions - Latest release: 22 days ago - 1 dependent repository - 206 downloads last month - 314 stars on GitHub - 2 maintainers
airconditioner 0.9.2
Yaml based DAG configurator for airflow
1 version - Latest release: over 4 years ago - 1 dependent repository - 7 downloads last month - 0 stars on GitHub - 1 maintainer
Top 10.0% on pypi.org
timehash 1.2
Module to encode and decode timestamps to/from TimeHashes
3 versions - Latest release: over 2 years ago - 2 dependent repositories - 609 downloads last month - 37 stars on GitHub - 2 maintainers
pscli 1.1.1
Unofficial Cisco Parstream Client with improved cli
3 versions - Latest release: over 3 years ago - 1 dependent repository - 14 downloads last month - 2 stars on GitHub - 2 maintainers
datacatalog-fileset-enricher 1.2.0
A package for enriching the content of a fileset Entry with Datacatalog Tags
8 versions - Latest release: about 4 years ago - 1 dependent repository - 147 downloads last month - 4 stars on GitHub - 2 maintainers
Top 5.0% on pypi.org
bigartm 0.9.2
BigARTM: the state-of-the-art platform for topic modeling
1 version - Latest release: over 4 years ago - 1 dependent package - 11 dependent repositories - 134 downloads last month - 661 stars on GitHub - 3 maintainers
bigartm10 0.10.1
BigARTM: the state-of-the-art platform for topic modeling
1 version - Latest release: over 4 years ago - 1 dependent repository - 92 downloads last month - 661 stars on GitHub - 4 maintainers
bigartm9 0.9.2 removed
BigARTM: the state-of-the-art platform for topic modeling
1 version - Latest release: over 1 year ago - 18 downloads last month - 630 stars on GitHub - 1 maintainer
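
Each entry's summary line in the listing above follows a regular pattern: fields separated by " - ", most of them a number (optionally scaled by "thousand" or "million") followed by a label. As a minimal sketch, assuming that pattern holds for every entry, such a line could be turned into structured data as shown below. The function name parse_summary and the output keys are hypothetical choices for illustration; they are not part of the Ecosyste.ms API.

```python
import re

# Scale words used in the listing's download counts.
_MULTIPLIERS = {"thousand": 1_000, "million": 1_000_000}

# Label prefixes seen in the listing, mapped to (hypothetical) output keys.
# Prefix matching covers both singular and plural forms.
_KEYS = {
    "version": "versions",
    "dependent package": "dependent_packages",
    "dependent repositor": "dependent_repositories",
    "downloads last month": "downloads_last_month",
    "stars": "stars",
    "star ": "stars",
    "maintainer": "maintainers",
}

_FIELD = re.compile(r"^([\d.,]+)(?: (thousand|million))? (.+)$")


def parse_summary(line: str) -> dict:
    """Parse one summary line of the listing into a dict of fields."""
    fields = {}
    for part in line.split(" - "):
        part = part.strip()
        if part.startswith("Latest release:"):
            # Relative dates ("2 days ago") are kept as plain text.
            fields["latest_release"] = part[len("Latest release:"):].strip()
            continue
        match = _FIELD.match(part)
        if match is None:
            continue  # not a "number label" field; ignore it
        number, scale, label = match.groups()
        value = round(float(number.replace(",", "")) * _MULTIPLIERS.get(scale, 1))
        for prefix, key in _KEYS.items():
            if label.startswith(prefix):
                fields[key] = value
                break
    return fields


if __name__ == "__main__":
    sample = (
        "38 versions - Latest release: about 2 months ago - "
        "53 dependent packages - 287 dependent repositories - "
        "92.2 thousand downloads last month - 3,206 stars on GitHub - "
        "3 maintainers"
    )
    print(parse_summary(sample))
```

Counts written as "92.2 thousand" are already rounded in the listing, so the parsed values are approximations, not exact download figures.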