Ecosyste.ms: Packages

An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

conda-forge.org "big-data" keyword

uproot4 0.1.2
ROOT I/O in pure Python and NumPy.
4 versions - Latest release: over 3 years ago - 2 dependent packages - 1 dependent repository - 178 stars on GitHub
couchdb 3.2.2
Apache CouchDB™ lets you access your data where you need it. The Couch Replication Protocol is im...
1 version - Latest release: almost 2 years ago - 5,638 stars on GitHub
koalas 1.8.2
Koalas: pandas API on Apache Spark
42 versions - Latest release: over 2 years ago - 1 dependent package - 3,256 stars on GitHub
Top 5.3% on conda-forge.org
catboost 1.1.1
General purpose gradient boosting on decision trees library with categorical features support out...
61 versions - Latest release: over 1 year ago - 9 dependent packages - 32 dependent repositories - 7,012 stars on GitHub
r-catboost 1.1.1
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking,...
49 versions - Latest release: over 1 year ago - 7,013 stars on GitHub
uproot 4.3.7
uproot (originally μproot, for "micro-Python ROOT") is a reader and a writer of the ROOT file for...
90 versions - Latest release: over 1 year ago - 9 dependent packages - 32 dependent repositories - 189 stars on GitHub
uproot-base 4.3.7
uproot (originally μproot, for "micro-Python ROOT") is a reader and a writer of the ROOT file for...
88 versions - Latest release: over 1 year ago - 1 dependent package - 8 dependent repositories - 189 stars on GitHub
richdem 2.3.0
High-performance Terrain and Hydrology Analysis
2 versions - Latest release: about 2 years ago - 2 dependent packages - 14 dependent repositories - 201 stars on GitHub
nipype 1.8.5
Workflows and interfaces for neuroimaging packages
45 versions - Latest release: over 1 year ago - 1 dependent package - 5 dependent repositories - 661 stars on GitHub
r-h2o 3.38.0.1
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gra...
11 versions - Latest release: over 1 year ago - 6,189 stars on GitHub
h2o-py 3.38.0.2
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gra...
31 versions - Latest release: over 1 year ago - 6,189 stars on GitHub
apache-beam-with-azure 2.42.0
Apache Beam is an open source, unified model for defining both batch and streaming data-parallel ...
11 versions - Latest release: over 1 year ago - 6,627 stars on GitHub
eland 8.3.0
Eland is an Elasticsearch client Python package to analyse, explore and manipulate data that resid...
13 versions - Latest release: almost 2 years ago - 488 stars on GitHub
delta-sharing-python 0.5.2
An open protocol for secure data sharing
6 versions - Latest release: over 1 year ago - 539 stars on GitHub
ibmpairs 0.1.3
open source tools for interaction with IBM PAIRS:
5 versions - Latest release: about 3 years ago - 1 dependent repository - 29 stars on GitHub
uproot3 3.14.4
ROOT I/O in pure Python and NumPy.
2 versions - Latest release: about 3 years ago - 5 dependent packages - 5 dependent repositories - 316 stars on GitHub
verticapy 0.11.0
VerticaPy is a Python library that exposes scikit-like functionality to conduct data science pro...
17 versions - Latest release: over 1 year ago - 124 stars on GitHub
r-bigstatsr 1.5.12
R package for statistical tools with big matrices stored on disk.
4 versions - Latest release: over 1 year ago - 1 dependent repository - 164 stars on GitHub
Top 2.2% on conda-forge.org
cython 0.29.32 💰
Cython is an optimising static compiler for both the Python programming language and the extended...
49 versions - Latest release: almost 2 years ago - 136 dependent packages - 1,295 dependent repositories - 7,766 stars on GitHub
pyfit-sne 1.2.1
Fast Fourier Transform-accelerated Interpolation-based t-SNE (FIt-SNE)
2 versions - Latest release: almost 3 years ago - 550 stars on GitHub
fit-sne 1.2.1
Fast Fourier Transform-accelerated Interpolation-based t-SNE (FIt-SNE)
3 versions - Latest release: about 4 years ago - 550 stars on GitHub
r-filearray 0.1.5
Out-of-memory Arrays in R
1 version - Latest release: over 1 year ago - 2 dependent packages - 15 stars on GitHub
awkward0 0.15.5
Manipulate arrays of complex data structures as easily as NumPy.
2 versions - Latest release: about 3 years ago - 4 dependent packages - 4 dependent repositories - 219 stars on GitHub
feast 0.26.0
Feature Store for Machine Learning
1 version - Latest release: over 1 year ago - 4,088 stars on GitHub
apache-beam-with-aws 2.42.0
Apache Beam is an open source, unified model for defining both batch and streaming data-parallel ...
11 versions - Latest release: over 1 year ago - 6,627 stars on GitHub
Top 9.8% on conda-forge.org
orc 1.8.0
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
23 versions - Latest release: over 1 year ago - 5 dependent packages - 51 dependent repositories - 595 stars on GitHub
datafusion 0.4.0
DataFusion is an extensible query execution framework, written in Rust, that uses Apache Arrow as...
1 version - Latest release: over 2 years ago - 2 dependent packages - 3,356 stars on GitHub
r-sparkr 3.3.1
Apache Spark - A unified analytics engine for large-scale data processing
9 versions - Latest release: over 1 year ago - 37,996 stars on GitHub
daal4py 2021.6.0
LEGAL NOTICE: Use of this software package is subject to the software license agreement (...
10 versions - Latest release: over 1 year ago - 4 dependent packages - 57 dependent repositories - 918 stars on GitHub
scikit-learn-intelex 2021.6.0
LEGAL NOTICE: Use of this software package is subject to the software license agreement (...
7 versions - Latest release: over 1 year ago - 2 dependent packages - 63 dependent repositories - 919 stars on GitHub
apache-beam-with-gcp 2.42.0
Apache Beam is an open source, unified model for defining both batch and streaming data-parallel ...
11 versions - Latest release: over 1 year ago - 2 dependent repositories - 6,627 stars on GitHub
Top 9.3% on conda-forge.org
apache-beam 2.42.0
Apache Beam is an open source, unified model for defining both batch and streaming data-parallel ...
34 versions - Latest release: over 1 year ago - 8 dependent packages - 1 dependent repository - 6,627 stars on GitHub
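
Each entry in the listing above ends with a summary line of counts separated by " - ". As a minimal sketch, and assuming that line format holds for every entry, the counts could be extracted as shown below. The helper name `parse_stats` and the dictionary keys are hypothetical and are not part of the Ecosyste.ms API.

```python
import re

# Hypothetical patterns for the count fields seen in the summary lines.
# Singular and plural forms are both accepted.
STAT_PATTERNS = {
    "versions": r"([\d,]+) versions?\b",
    "dependent_packages": r"([\d,]+) dependent packages?\b",
    "dependent_repositories": r"([\d,]+) dependent repositor(?:y|ies)\b",
    "stars": r"([\d,]+) stars on GitHub",
}


def parse_stats(line):
    """Return the integer counts found in a summary line.

    Fields that do not appear in the line are omitted from the result.
    """
    stats = {}
    for key, pattern in STAT_PATTERNS.items():
        match = re.search(pattern, line)
        if match:
            # Strip thousands separators such as "1,295" before converting.
            stats[key] = int(match.group(1).replace(",", ""))
    return stats


example = (
    "49 versions - Latest release: almost 2 years ago - "
    "136 dependent packages - 1,295 dependent repositories - "
    "7,766 stars on GitHub"
)
print(parse_stats(example))
```

The relative release age ("almost 2 years ago") is left unparsed here because it is approximate and depends on when the page was rendered.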