Ecosyste.ms: Packages
An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.
pypi.org "bigdata" keyword
ckitoolz 0.1.1
Cython implementation of Kitoolz: High performance functional utilities
3 versions - Latest release: almost 3 years ago - 1 dependent repository - 46 downloads last month - 2 maintainers
cuallee 0.10.1
Python library for data validation on DataFrame APIs including Snowflake/Snowpark, Apache/PySpark...
74 versions - Latest release: 2 days ago - 1 dependent package - 1 dependent repository - 11.9 thousand downloads last month - 107 stars on GitHub - 2 maintainers
spark-celery 0.1.1
A helper to allow Python Celery tasks to do work in a Spark job
3 versions - Latest release: over 6 years ago - 1 dependent repository - 360 downloads last month - 27 stars on GitHub - 1 maintainer
Top 9.9% on pypi.org
arvados-cwl-runner 2.7.2
Arvados Common Workflow Language runner
361 versions - Latest release: 23 days ago - 5 dependent repositories - 1.65 thousand downloads last month - 364 stars on GitHub - 1 maintainer
Top 5.5% on pypi.org
arvados-python-client 2.7.2
Arvados client library
689 versions - Latest release: 23 days ago - 11 dependent repositories - 6.32 thousand downloads last month - 363 stars on GitHub - 1 maintainer
arvados_fuse 2.7.2
Arvados FUSE driver
503 versions - Latest release: 23 days ago - 2 dependent repositories - 1.54 thousand downloads last month - 363 stars on GitHub - 1 maintainer
arvados-pam 2.0.4
Arvados PAM module
32 versions - Latest release: over 3 years ago - 99 downloads last month - 363 stars on GitHub - 1 maintainer
crunchstat_summary 2.7.2
Arvados crunchstat-summary reads crunch log files and summarizes resource usage
20 versions - Latest release: 23 days ago - 302 downloads last month - 363 stars on GitHub - 1 maintainer
pysparkify 0.26.3
Spark based ETL
12 versions - Latest release: 1 day ago - 511 downloads last month - 1 star on GitHub - 2 maintainers
pycebes 0.10.2
Python client for Cebes HTTP server.
2 versions - Latest release: about 6 years ago - 1 dependent repository - 13 downloads last month - 2 stars on GitHub - 2 maintainers
Top 1.1% on pypi.org
vispy 0.14.2
Interactive visualization in Python
38 versions - Latest release: about 2 months ago - 53 dependent packages - 287 dependent repositories - 92.2 thousand downloads last month - 3,206 stars on GitHub - 3 maintainers
Top 8.3% on pypi.org
visualpython 3.0.1 💰
Visual Python is a GUI-based Python code generator, developed on the Jupyter Notebook as an exten...
88 versions - Latest release: 6 months ago - 1 dependent repository - 1.74 thousand downloads last month - 755 stars on GitHub - 1 maintainer
Top 5.5% on pypi.org
jupyterlab-visualpython 3.0.1 💰
GUI-based Python code generator for Jupyter Lab as an extension
22 versions - Latest release: 6 months ago - 4 dependent repositories - 1.09 thousand downloads last month - 755 stars on GitHub - 2 maintainers
exelog 0.0.1
Enabling meticulous logging for Spark Applications
1 version - Latest release: over 2 years ago - 1 dependent repository - 18 downloads last month - 5 stars on GitHub - 2 maintainers
Top 9.4% on pypi.org
pyoptimus 0.1.0
Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion.
32 versions - Latest release: over 1 year ago - 1 dependent repository - 339 downloads last month - 1,441 stars on GitHub - 2 maintainers
pyspark-spy 1.0.2
Collect and aggregate on spark events for profitz. In 🐍 way!
3 versions - Latest release: about 3 years ago - 1 dependent repository - 218 downloads last month - 10 stars on GitHub - 2 maintainers
vulkn 19.0.10
The environmentally friendly petabyte scale Python eco-system built on Yandex ClickHouse
10 versions - Latest release: about 4 years ago - 1 dependent repository - 78 downloads last month - 43 stars on GitHub - 2 maintainers
Top 1.3% on pypi.org
vaex-core 4.17.1
Core of vaex
101 versions - Latest release: 10 months ago - 15 dependent packages - 64 dependent repositories - 68.1 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 1.5% on pypi.org
vaex-jupyter 0.8.2
Jupyter notebook and Jupyter lab support for vaex
23 versions - Latest release: 10 months ago - 4 dependent packages - 50 dependent repositories - 22.1 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 1.5% on pypi.org
vaex-server 0.9.0
Webserver and client for vaex for a remote dataset
21 versions - Latest release: 10 months ago - 3 dependent packages - 51 dependent repositories - 20.8 thousand downloads last month - 8,171 stars on GitHub - 2 maintainers
bdtool 9.0.0
script of managing some bigdata technology stack
11 versions - Latest release: about 2 years ago - 1 dependent repository - 29 downloads last month - 0 stars on GitHub - 1 maintainer
legend-pydataobj 1.6.2
LEGEND Python Data Objects
17 versions - Latest release: 1 day ago - 3 dependent packages - 1.98 thousand downloads last month - 0 stars on GitHub - 2 maintainers
mextractor 5.0.0 💰
mextractor can extract media metadata to YAML and read them
16 versions - Latest release: about 1 month ago - 117 downloads last month - 4 stars on GitHub - 2 maintainers
zqy 1.0.5
belongs to zqy.
4 versions - Latest release: over 6 years ago - 1 dependent repository - 45 downloads last month - 2 maintainers
Top 1.5% on pypi.org
vaex-hdf5 0.14.1
hdf5 file support for vaex
36 versions - Latest release: over 1 year ago - 4 dependent packages - 58 dependent repositories - 26.9 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
robotframework-dynamodbsqllibrary 0.3.1
An Amazon AWS DynamoDB big data testing library for Robot Framework with SQL-like DSL
9 versions - Latest release: about 1 year ago - 2 dependent repositories - 1.01 thousand downloads last month - 4 stars on GitHub - 1 maintainer
bart-extract-ga 0.1.1
Extract event views to Google Analytics
2 versions - Latest release: over 3 years ago - 1 dependent repository - 7 downloads last month - 0 stars on GitHub - 1 maintainer
Top 4.3% on pypi.org
resiliparse 0.14.7
A collection of robust and fast processing tools for parsing and analyzing (not only) web archive...
66 versions - Latest release: 3 days ago - 2 dependent packages - 4 dependent repositories - 19.9 thousand downloads last month - 42 stars on GitHub - 1 maintainer
Top 3.3% on pypi.org
fastwarc 0.14.7
A high-performance WARC parsing library for Python written in C++/Cython.
72 versions - Latest release: 3 days ago - 4 dependent packages - 5 dependent repositories - 199 thousand downloads last month - 42 stars on GitHub - 1 maintainer
Top 0.3% on pypi.org
avro 1.11.3
Avro is a serialization and RPC framework.
32 versions - Latest release: 7 months ago - 46 dependent packages - 1,066 dependent repositories - 7.1 million downloads last month - 2,755 stars on GitHub - 8 maintainers
Top 0.4% on pypi.org
avro-python3 1.10.2
Avro is a serialization and RPC framework.
12 versions - Latest release: about 3 years ago - 34 dependent packages - 742 dependent repositories - 6.84 million downloads last month - 2,755 stars on GitHub - 7 maintainers
bigdatasml 0.1.3
This package calculates average student performances
3 versions - Latest release: over 2 years ago - 40 downloads last month - 2 maintainers
akudu 0.0.1
Asyncio Client for Apache Kudu Database Engine (Storage System).
1 version - Latest release: about 1 year ago - 19 downloads last month - 1 star on GitHub - 2 maintainers
Top 2.2% on pypi.org
uproot 5.3.3
ROOT I/O in pure Python and NumPy.
300 versions - Latest release: 21 days ago - 69 dependent packages - 240 dependent repositories - 235 thousand downloads last month - 218 stars on GitHub - 2 maintainers
Top 1.5% on pypi.org
vaex-viz 0.5.4
Visualization for vaex
24 versions - Latest release: over 1 year ago - 3 dependent packages - 54 dependent repositories - 21.5 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 1.6% on pypi.org
vaex-astro 0.9.3
Astronomy related transformations and FITS file support
18 versions - Latest release: over 1 year ago - 2 dependent packages - 51 dependent repositories - 20.6 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
datacatalog-custom-entries-manager 0.1.2
A package to manage Google Cloud Data Catalog custom entries
4 versions - Latest release: over 3 years ago - 1 dependent repository - 57 downloads last month - 2 stars on GitHub - 2 maintainers
elixirnote 4.0.0a30
go from data to knowledge
4 versions - Latest release: over 1 year ago - 19 downloads last month - 10 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
vaex 4.17.0
Out-of-Core DataFrames to visualize and explore big tabular datasets
58 versions - Latest release: 10 months ago - 20 dependent packages - 90 dependent repositories - 21.6 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 1.6% on pypi.org
vaex-ml 0.18.3
Machine learning support for vaex
34 versions - Latest release: 10 months ago - 2 dependent packages - 47 dependent repositories - 20.7 thousand downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
cytoolz 0.12.3
Cython implementation of Toolz: High performance functional utilities
23 versions - Latest release: 3 months ago - 88 dependent packages - 11,437 dependent repositories - 2.36 million downloads last month - 964 stars on GitHub - 1 maintainer
bart-simulator 0.2.1
Send event views to Google Analytics and Generator customers or products
4 versions - Latest release: over 3 years ago - 1 dependent repository - 47 downloads last month - 0 stars on GitHub - 1 maintainer
sysxtract 1.0.0
Extract logs based off events from sysmon. Comes as a package, cli and ui.
1 version - Latest release: almost 4 years ago - 1 dependent repository - 9 downloads last month - 3 stars on GitHub - 2 maintainers
Top 6.2% on pypi.org
vaex-graphql 0.2.0
GraphQL support for accessing vaex DataFrame
3 versions - Latest release: about 3 years ago - 9 dependent repositories - 171 downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 2.7% on pypi.org
uproot3 3.14.4
ROOT I/O in pure Python and Numpy.
5 versions - Latest release: about 3 years ago - 13 dependent packages - 30 dependent repositories - 14.5 thousand downloads last month - 314 stars on GitHub - 1 maintainer
zqybxqqwetest 1.0.1
belongs to zqy.
1 version - Latest release: almost 7 years ago - 1 dependent repository - 16 downloads last month - 2 maintainers
Top 4.4% on pypi.org
nflx-genie-client 3.6.17
Genie Python Client.
102 versions - Latest release: 9 months ago - 2 dependent repositories - 78.3 thousand downloads last month - 1,679 stars on GitHub - 6 maintainers
foobar22 1.2.0 removed
This package is used for security research and demonstrations. It might contain dangerous code sn...
1 version - Latest release: almost 2 years ago - 2 dependent repositories - 598 stars on GitHub
ustore 1.7.5
Python bindings for Unum's UStore.
21 versions - Latest release: almost 2 years ago - 1 dependent repository - 27 downloads last month - 478 stars on GitHub - 2 maintainers
ukv 0.12.1
Python bindings for Unum's UStore.
23 versions - Latest release: about 1 year ago - 145 downloads last month - 478 stars on GitHub - 1 maintainer
vaex-distributed 0.3.0
Distributed dataset for vaex
3 versions - Latest release: about 5 years ago - 1 dependent repository - 18 downloads last month - 8,171 stars on GitHub - 2 maintainers
athenacli 1.6.8
CLI for Athena Database. With auto-completion and syntax highlighting.
24 versions - Latest release: almost 2 years ago - 1 dependent repository - 1.21 thousand downloads last month - 206 stars on GitHub - 2 maintainers
pybda 0.1.0
Analysis of big biological data sets for distributed HPC clusters.
6 versions - Latest release: over 4 years ago - 1 dependent repository - 54 downloads last month - 9 stars on GitHub - 2 maintainers
gigapipe 0.1.22
Gigapipe Python Client
13 versions - Latest release: over 1 year ago - 1 dependent repository - 73 downloads last month - 2 maintainers
fdict 0.8.1
Easy out-of-core computing of recursive dict
9 versions - Latest release: over 6 years ago - 1 dependent package - 1 dependent repository - 14.5 thousand downloads last month - 7 stars on GitHub - 2 maintainers
cybershuttle-sdk 0.0.1a4
Python SDK for Apache Airavata Cybershuttle
3 versions - Latest release: 7 months ago - 26 downloads last month - 98 stars on GitHub - 2 maintainers
unisonctl 0.1
A frontend for unison allowing it to handle some read-only large datasets more efficiently.
1 version - Latest release: over 6 years ago - 1 dependent repository - 10 downloads last month - 2 stars on GitHub - 2 maintainers
pysparkproxy 0.0.17
Seamlessly execute pyspark code on remote clusters
9 versions - Latest release: over 5 years ago - 1 dependent repository - 43 downloads last month - 4 stars on GitHub - 2 maintainers
risk-command-center 1.0.37
Risk Command Center, manage your risk easily.
2 versions - Latest release: almost 2 years ago - 1 dependent repository - 10 downloads last month - 11 stars on GitHub - 1 maintainer
airavata-django-portal-sdk 1.8.4
The Airavata Django Portal SDK is a library that makes it easier to develop Airavata Django Porta...
33 versions - Latest release: 10 months ago - 3 dependent repositories - 185 downloads last month - 0 stars on GitHub - 3 maintainers
wendelin.core 0.2
Out-of-core NumPy arrays
16 versions - Latest release: 9 months ago - 2 dependent repositories - 38 downloads last month - 2 maintainers
hdxcli 1.0rc51
Hydrolix command line utility to do CRUD operations on projects, tables, transforms and other res...
32 versions - Latest release: about 1 month ago - 142 downloads last month - 6 stars on GitHub - 1 maintainer
Top 3.7% on pypi.org
pybloom-live 4.0.0
Bloom filter: A Probabilistic data structure
7 versions - Latest release: over 1 year ago - 4 dependent packages - 65 dependent repositories - 657 thousand downloads last month - 158 stars on GitHub - 2 maintainers
Top 4.8% on pypi.org
uproot4 4.0.0
ROOT I/O in pure Python and NumPy.
30 versions - Latest release: over 3 years ago - 10 dependent packages - 11 dependent repositories - 20.7 thousand downloads last month - 199 stars on GitHub - 4 maintainers
Top 9.3% on pypi.org
anovos 1.1.0
An Open Source tool for Feature Engineering in Machine Learning
8 versions - Latest release: over 1 year ago - 2 dependent repositories - 1.56 thousand downloads last month - 77 stars on GitHub - 1 maintainer
zqybxqtest 1.0.1
belongs to zqy.
1 version - Latest release: almost 7 years ago - 1 dependent repository - 3 downloads last month - 2 maintainers
Top 9.6% on pypi.org
vaex-ui 0.3.0
Graphical user interface for vaex based on Qt
7 versions - Latest release: over 4 years ago - 3 dependent repositories - 39 downloads last month - 8,171 stars on GitHub - 2 maintainers
dtshare 1.0.3
Open financial data
2 versions - Latest release: about 4 years ago - 1 dependent repository - 28 downloads last month - 514 stars on GitHub - 2 maintainers
Top 8.1% on pypi.org
datafaker 0.7.6
A tool for generating batch test data or stream data.
51 versions - Latest release: over 2 years ago - 2 dependent repositories - 158 downloads last month - 607 stars on GitHub - 2 maintainers
datacatalog-fileset-processor 0.1.5
A package to manage Google Cloud Data Catalog Fileset scripts
6 versions - Latest release: over 3 years ago - 1 dependent repository - 160 downloads last month - 3 stars on GitHub - 2 maintainers
spark-yarn-submit 1.0.0
library to handle spark job submit in a yarn cluster in different environment
1 version - Latest release: about 7 years ago - 1 dependent repository - 3 downloads last month - 3 stars on GitHub - 2 maintainers
grapetree 1.6.1
Web interface of GrapeTree, which is a program for phylogenetic analysis.
16 versions - Latest release: over 4 years ago - 1 dependent repository - 109 downloads last month - 74 stars on GitHub - 6 maintainers
datacatalog-tag-manager 2.2.0
A package to manage Google Cloud Data Catalog tags, loading metadata from external sources
15 versions - Latest release: over 3 years ago - 1 dependent repository - 249 downloads last month - 17 stars on GitHub - 2 maintainers
lemuras 1.2.3
A small Python library to deal with big tables
6 versions - Latest release: about 4 years ago - 1 dependent repository - 12 downloads last month - 3 stars on GitHub - 2 maintainers
package-lab-vr-demo-success 99.99.99
disclaimer, This package is used for security research and demonstrations. It might contain dange...
1 version - Latest release: almost 2 years ago - 1 dependent repository - 4 downloads last month - 3,355 stars on GitHub - 1 maintainer
mister 0.0.2
Approachable map/reduce jobs
2 versions - Latest release: over 5 years ago - 1 dependent repository - 12 downloads last month - 0 stars on GitHub - 2 maintainers
dlopes7-avro 1.12.0
Avro is a serialization and RPC framework.
1 version - Latest release: 6 months ago - 20 downloads last month - 2,706 stars on GitHub - 1 maintainer
vector-lake 0.0.4
S3 vector database for bigdata
4 versions - Latest release: 9 months ago - 21 downloads last month - 19 stars on GitHub - 2 maintainers
django-mass-migration 0.2.9
Django app for long-running data migrations
16 versions - Latest release: 4 months ago - 1.11 thousand downloads last month - 2 stars on GitHub - 2 maintainers
taospyudf 0.0.11
taos python udf
9 versions - Latest release: 12 months ago - 589 downloads last month - 22,767 stars on GitHub - 1 maintainer
datacatalog-custom-model-manager 0.1.1
A package to load user-specified metadata models into Google Cloud Data Catalog
2 versions - Latest release: over 3 years ago - 1 dependent repository - 8 downloads last month - 6 stars on GitHub - 2 maintainers
datacatalog-util 0.11.6
A package to manage Google Cloud Data Catalog helper commands and scripts
21 versions - Latest release: over 3 years ago - 1 dependent repository - 172 downloads last month - 20 stars on GitHub - 2 maintainers
Top 8.1% on pypi.org
km3io 1.1.0
"KM3NeT I/O library without ROOT"
67 versions - Latest release: about 2 months ago - 2 dependent packages - 2 dependent repositories - 751 downloads last month - 314 stars on GitHub - 4 maintainers
databend 1.2.344
Databend Python Binding
259 versions - Latest release: 2 months ago - 368 downloads last month - 7,150 stars on GitHub - 4 maintainers
elasticsearch-partition 2.0.0
A Python library for creating Elasticsearch partitioned indexes by date range
5 versions - Latest release: about 5 years ago - 1 dependent repository - 12 downloads last month - 4 stars on GitHub - 2 maintainers
dpark 0.5.0
Python clone of Spark, MapReduce like computing framework supporting iterative algorithms.
19 versions - Latest release: almost 6 years ago - 1 dependent repository - 28 downloads last month - 2,693 stars on GitHub - 4 maintainers
panel-vegafusion 0.0.3
Build interactive big data apps with Altair and Vega easily using Panel + VegaFusion.
3 versions - Latest release: over 2 years ago - 1 dependent repository - 24 downloads last month - 14 stars on GitHub - 2 maintainers
hivehoney 1.0.4
Client-less data retrieval from Hive.
5 versions - Latest release: about 5 years ago - 1 dependent repository - 18 downloads last month - 3 stars on GitHub - 2 maintainers
vaex-contrib 0.1.3
Community contributed modules to vaex
3 versions - Latest release: over 1 year ago - 1 dependent repository - 17 downloads last month - 8,171 stars on GitHub - 1 maintainer
Top 4.8% on pypi.org
optimuspyspark 2.2.32
Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion wi...
83 versions - Latest release: almost 4 years ago - 8 dependent repositories - 8.75 thousand downloads last month - 1,439 stars on GitHub - 4 maintainers
workflux 0.5.0
A platform-agnostic, cloud-ready framework for simplified deployment of the Common Workflow Langu...
1 version - Latest release: almost 3 years ago - 1 dependent repository - 11 downloads last month - 32 stars on GitHub - 2 maintainers
pytispark 1.0.1
TiSpark support for python
8 versions - Latest release: almost 6 years ago - 1 dependent repository - 49 downloads last month - 878 stars on GitHub - 3 maintainers
heppyness 2.1.0
ROOT I/O in pure Python and NumPy.
18 versions - Latest release: 22 days ago - 1 dependent repository - 206 downloads last month - 314 stars on GitHub - 2 maintainers
airconditioner 0.9.2
Yaml based DAG configurator for airflow
1 version - Latest release: over 4 years ago - 1 dependent repository - 7 downloads last month - 0 stars on GitHub - 1 maintainer
Top 10.0% on pypi.org
timehash 1.2
Module to encode and decode timestamps to/from TimeHashes
3 versions - Latest release: over 2 years ago - 2 dependent repositories - 609 downloads last month - 37 stars on GitHub - 2 maintainers
pscli 1.1.1
Unofficial Cisco Parstream Client with improved cli
3 versions - Latest release: over 3 years ago - 1 dependent repository - 14 downloads last month - 2 stars on GitHub - 2 maintainers
datacatalog-fileset-enricher 1.2.0
A package for enriching the content of a fileset Entry with Datacatalog Tags
8 versions - Latest release: about 4 years ago - 1 dependent repository - 147 downloads last month - 4 stars on GitHub - 2 maintainers
Top 5.0% on pypi.org
bigartm 0.9.2
BigARTM: the state-of-the-art platform for topic modeling
1 version - Latest release: over 4 years ago - 1 dependent package - 11 dependent repositories - 134 downloads last month - 661 stars on GitHub - 3 maintainers
bigartm10 0.10.1
BigARTM: the state-of-the-art platform for topic modeling
1 version - Latest release: over 4 years ago - 1 dependent repository - 92 downloads last month - 661 stars on GitHub - 4 maintainers
bigartm9 0.9.2 removed
BigARTM: the state-of-the-art platform for topic modeling
1 version - Latest release: over 1 year ago - 18 downloads last month - 630 stars on GitHub - 1 maintainer
Related Keywords
python (65)
machine-learning (23)
data-science (19)
spark (17)
visualization (17)
big-data (16)
machinelearning (15)
hdf5 (14)
dataframe (13)
memory-mapped-file (13)
pyarrow (13)
tabular-data (13)
pyspark (11)
workflow (10)
gcp (10)
ruby (8)
database (8)
docker (8)
cloud (8)
bioinformatics (8)
analysis (8)
python3 (8)
pandas (8)
numpy (7)
java (7)
aws (7)
genomics (6)
cluster (6)
datacatalog (6)
azure (6)
hadoop (6)
hive (6)
data-governance (5)
file-format (5)
analytics (5)
cython (5)
hep (5)
hep-ex (5)
workflow-engine (5)
scikit-hep (5)
go (5)
root-cern (5)
cwl (5)
root (5)
sql (5)
data-analysis (5)
json (5)
arvados (5)
google-cloud (4)
gcp-datacatalog (4)
rust (4)
perl (4)
cli (4)
csv-import (4)
c (4)
microservices (3)
apache-spark (3)
data-extraction (3)
hdfs (3)
csv (3)
web (3)
lazy (3)
avro (3)
serialization (3)
rpc (3)
data (3)
dataquality (3)
cplusplus (3)
csharp (3)
nosql (3)
bigartm (3)
c-plus-plus (3)
python-api (3)
regularizer (3)
text-mining (3)
dotnet (3)
topic-modeling (3)
php (3)
hep-py (2)
webarchive (2)
warc (2)
htmlparser (2)
chrome-extension (2)
workflows (2)
airavata (2)
cloudnative (2)
gateways (2)
sciencegateways (2)
workfloworchestrator (2)
chile (2)
cloudera (2)
data-engineer (2)
data-engineering (2)
data-warehouse (2)
datamart (2)
gdpr (2)
hortonworks (2)
huemul (2)
huemul-bigdatagovernance (2)
parquet (2)