Ecosyste.ms: Packages
An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.
conda-forge.org "data-science" keyword
mapie 0.5.0
A scikit-learn-compatible module for estimating prediction intervals.8 versions - Latest release: over 1 year ago - 1 dependent repositories - 689 stars on GitHub
Top 4.7% on conda-forge.org
imbalanced-learn 0.9.1
imbalanced-learn is a python package offering a number of re-sampling techniques commonly used in...15 versions - Latest release: about 2 years ago - 8 dependent packages - 118 dependent repositories - 6,276 stars on GitHub
Top 4.9% on conda-forge.org
prefect 2.6.7
Prefect is a workflow management system, designed for modern infrastructure and powered by the op...133 versions - Latest release: over 1 year ago - 8 dependent packages - 41 dependent repositories - 11,520 stars on GitHub
chartify 3.0.3
Python library that makes it easy for data scientists to create charts.10 versions - Latest release: over 3 years ago - 3,297 stars on GitHub
snorkel 0.9.9
Snorkel is a system for programmatically building and managing training datasets to rapidly and f...10 versions - Latest release: almost 2 years ago - 1 dependent package - 4 dependent repositories - 5,445 stars on GitHub
rubicon-ml 0.4.0
rubicon-ml is a machine learning solution designed to help standardize the model development life...33 versions - Latest release: over 1 year ago - 2 dependent repositories - 99 stars on GitHub
miceforest 5.6.2
Multiple Imputation iteratively 'fills in' missing values in a dataset by modeling each variable ...8 versions - Latest release: almost 2 years ago - 229 stars on GitHub
pdpipe 0.3.2
Ever written a preprocessing pipeline for pandas dataframes and had trouble serializing it for la...20 versions - Latest release: over 1 year ago - 703 stars on GitHub
r-targets 0.14.0
Function-oriented Make-like declarative workflows for R
18 versions - Latest release: over 1 year ago - 2 dependent packages - 1 dependent repositories - 845 stars on GitHub
mljar-mercury 0.5.1
Build Web Apps in Jupyter Notebook with Python only
3 versions - Latest release: over 2 years ago - 2,513 stars on GitHub
Top 7.3% on conda-forge.org
ray-tune 2.0.1
Ray is a fast and simple framework for building and running distributed applications. It is split...21 versions - Latest release: over 1 year ago - 4 dependent packages - 6 dependent repositories - 28,849 stars on GitHub
jupyterlab_templates 0.3.2
Jupyter notebook templates
5 versions - Latest release: over 1 year ago - 314 stars on GitHub
mljar-supervised 0.11.3
The mljar-supervised is an Automated Machine Learning Python package that works with tabular data...5 versions - Latest release: almost 2 years ago - 2,520 stars on GitHub
Top 10.0% on conda-forge.org
doit 0.36.0 💰
`doit` is a task management & automation tool. `doit` comes from the idea of bringing the power ...10 versions - Latest release: about 2 years ago - 6 dependent packages - 27 dependent repositories - 1,551 stars on GitHub
mage-ai 0.7.5
🧙 The modern replacement for Airflow. Build, run, and manage data pipelines for integrating and t...22 versions - Latest release: over 1 year ago - 3,631 stars on GitHub
great-expectations 0.15.32
Great Expectations helps teams save time and promote analytic integrity by offering a unique appr...144 versions - Latest release: over 1 year ago - 2 dependent packages - 1 dependent repositories - 8,121 stars on GitHub
superset 2.0.0
Apache Superset is a Data Visualization and Data Exploration Platform
12 versions - Latest release: almost 2 years ago - 58,575 stars on GitHub
traceml 1.0.0
Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for P...1 version - Latest release: almost 2 years ago - 463 stars on GitHub
Top 0.1% on conda-forge.org
pandas 1.5.1 💰
Flexible and powerful data analysis / manipulation library for Python, providing labeled data str...56 versions - Latest release: over 1 year ago - 1,663 dependent packages - 11,366 dependent repositories - 37,320 stars on GitHub
apache-airflow-providers-samba 4.0.0
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
7 versions - Latest release: almost 2 years ago - 33,057 stars on GitHub
Top 1.6% on conda-forge.org
spacy 3.4.3
spaCy is a library for advanced natural language processing in Python and Cython.68 versions - Latest release: over 1 year ago - 92 dependent packages - 174 dependent repositories - 25,557 stars on GitHub
Top 8.0% on conda-forge.org
geemap 0.17.2 💰
A Python package for interactive mapping with Google Earth Engine, ipyleaflet, and folium.99 versions - Latest release: over 1 year ago - 4 dependent packages - 27 dependent repositories - 2,792 stars on GitHub
python-crfsuite 0.9.8
A python binding for crfsuite
7 versions - Latest release: about 2 years ago - 8 dependent packages - 2 dependent repositories - 749 stars on GitHub
bookstore 2.5.1
bookstore provides tooling and workflow recommendations for storing, scheduling, and publishing n...2 versions - Latest release: over 4 years ago - 187 stars on GitHub
sf-hamilton 1.11.0
A scalable general purpose micro-framework for defining dataflows. You can use it to build datafr...12 versions - Latest release: over 1 year ago - 688 stars on GitHub
lux-api 0.5.1
Automatically visualize your pandas dataframe via a single print! 📊 💡
6 versions - Latest release: over 2 years ago - 1 dependent package - 2 dependent repositories - 4,488 stars on GitHub
lux 0.5.1
Automatically visualize your pandas dataframe via a single print! 📊 💡
1 version - Latest release: over 1 year ago - 4,485 stars on GitHub
Top 8.8% on conda-forge.org
orange3 3.33.0 💰
Open source data visualization and data analysis for novice and expert. Interactive workflows wit...54 versions - Latest release: over 1 year ago - 15 dependent packages - 2 dependent repositories - 4,009 stars on GitHub
Top 8.6% on conda-forge.org
tiledb 2.12.2
TileDB is an efficient multi-dimensional array management system which introduces a novel on-disk...88 versions - Latest release: over 1 year ago - 8 dependent packages - 155 dependent repositories - 1,481 stars on GitHub
r-loon 1.4.0
A Toolkit for Interactive Statistical Data Visualization
1 version - Latest release: over 1 year ago - 45 stars on GitHub
thepipe 1.3.8
A simplistic, general purpose pipeline framework.4 versions - Latest release: over 1 year ago - 1 dependent package - 13 stars on GitHub
vaex-core 4.14.0
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of ...54 versions - Latest release: over 1 year ago - 10 dependent packages - 1 dependent repositories - 7,837 stars on GitHub
vaex-viz 0.5.4
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of ...16 versions - Latest release: over 1 year ago - 3 dependent packages - 1 dependent repositories - 7,837 stars on GitHub
vaex-ml 0.18.0
Wrappers for various machine learning libraries to make them integrate into vaex.16 versions - Latest release: almost 2 years ago - 1 dependent package - 1 dependent repositories - 7,837 stars on GitHub
flytekit 1.1.0
Flytekit Python is the Python Library for easily authoring, testing, deploying, and interacting w...4 versions - Latest release: almost 2 years ago - 8 dependent packages - 123 stars on GitHub
flytekitplugins-sqlalchemy 1.2.4
SQLAlchemy plugin for Flytekit: `flytekitplugins-sqlalchemy` PyPI: [https://pypi.org/project/fly...8 versions - Latest release: over 1 year ago - 123 stars on GitHub
flytekitplugins-modin 1.2.4
Modin plugin for Flytekit: `flytekitplugins-modin` PyPI: [https://pypi.org/project/flytekitplugi...9 versions - Latest release: over 1 year ago - 123 stars on GitHub
flytekitplugins-athena 1.2.4
Athena plugin for Flytekit: `flytekitplugins-athena` PyPI: [https://pypi.org/project/flytekitplu...8 versions - Latest release: over 1 year ago - 123 stars on GitHub
flytekitplugins-awsbatch 1.2.4
AWS Batch plugin for Flytekit: `flytekitplugins-awsbatch` PyPI: [https://pypi.org/project/flytek...9 versions - Latest release: over 1 year ago - 123 stars on GitHub
flytekitplugins-data-fsspec 1.2.4
`fsspec` powered data-plugins for Flytekit: `flytekitplugins-data-fsspec` PyPI: [https://pypi.or...8 versions - Latest release: over 1 year ago - 123 stars on GitHub
flytekitplugins-spark 1.0.5
Spark 3 plugin for Flytekit: `flytekitplugins-spark` PyPI: [https://pypi.org/project/flytekitplu...1 version - Latest release: almost 2 years ago - 123 stars on GitHub
bowtie-py 0.11.0
Bowtie is a library for writing dashboards in Python. No need to know web frameworks or JavaScrip...7 versions - Latest release: over 5 years ago - 756 stars on GitHub
artificial-adversary 1.1.1
🗣️ Tool to generate adversarial text examples and test machine learning models against them
2 versions - Latest release: almost 4 years ago - 377 stars on GitHub
dash-table 5.0.0 💰
OBSOLETE: now part of https://github.com/plotly/dash
31 versions - Latest release: over 2 years ago - 12 dependent packages - 29 dependent repositories - 422 stars on GitHub
dvc-hdfs 2.19.0
Data Version Control or DVC is an open-source tool for data science and machine learning projects.67 versions - Latest release: almost 2 years ago - 1 dependent package - 1 dependent repositories - 11,242 stars on GitHub
dvc-ssh 2.20.0
Data Version Control or DVC is an open-source tool for data science and machine learning projects.67 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repositories - 11,242 stars on GitHub
dvc-oss 2.19.0
Data Version Control or DVC is an open-source tool for data science and machine learning projects.67 versions - Latest release: almost 2 years ago - 1 dependent package - 1 dependent repositories - 11,242 stars on GitHub
Top 6.1% on conda-forge.org
ray-default 2.0.1
Ray is a fast and simple framework for building and running distributed applications. It is split...15 versions - Latest release: over 1 year ago - 11 dependent packages - 4 dependent repositories - 28,849 stars on GitHub
ray-serve 2.0.1
Ray is a fast and simple framework for building and running distributed applications. It is split...21 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repositories - 28,849 stars on GitHub
ray-dashboard 2.0.1
Ray is a fast and simple framework for building and running distributed applications. It is split...21 versions - Latest release: over 1 year ago - 1 dependent package - 2 dependent repositories - 26,719 stars on GitHub
Top 5.1% on conda-forge.org
ray-core 2.0.1
Ray is a fast and simple framework for building and running distributed applications. It is split...21 versions - Latest release: over 1 year ago - 21 dependent packages - 5 dependent repositories - 28,849 stars on GitHub
ray-k8s 2.0.1
Ray is a fast and simple framework for building and running distributed applications. It is split...19 versions - Latest release: over 1 year ago - 1 dependent package - 26,341 stars on GitHub
pixiedust 1.1.19
Python Helper library for Jupyter Notebooks
3 versions - Latest release: over 3 years ago - 1 dependent repositories - 1,029 stars on GitHub
matbench 0.6
Matbench: Benchmarks for materials science property prediction
2 versions - Latest release: almost 2 years ago - 57 stars on GitHub
cfanalytics 0.1.4
Downloading, analyzing and visualizing CrossFit data
1 version - Latest release: over 1 year ago - 27 stars on GitHub
Top 6.4% on conda-forge.org
tensorflow-probability 0.18.0
TensorFlow Probability is a library for probabilistic reasoning and statistical analysis in Tenso...13 versions - Latest release: over 1 year ago - 6 dependent packages - 35 dependent repositories - 3,875 stars on GitHub
redshift_connector 2.0.908
redshift_connector is the Amazon Redshift connector for Python. Easy integration with pandas and ...26 versions - Latest release: almost 2 years ago - 2 dependent packages - 178 stars on GitHub
castoredc_api 0.1.4
Python Wrapper for Castor EDC API
7 versions - Latest release: almost 2 years ago - 2 stars on GitHub
plotly-resampler 0.8.1
Visualize large time series data with plotly.py
14 versions - Latest release: almost 2 years ago - 1 dependent repositories - 675 stars on GitHub
Top 8.1% on conda-forge.org
featuretools 1.18.0
Featuretools is a framework to perform automated feature engineering. It excels at transforming t...63 versions - Latest release: over 1 year ago - 7 dependent packages - 5 dependent repositories - 6,558 stars on GitHub
anndata 0.8.0
AnnData provides a scalable way of keeping track of data and learned annotations. It was initiall...10 versions - Latest release: about 2 years ago - 10 dependent packages - 17 dependent repositories - 358 stars on GitHub
rubrix 0.18.0
Rubrix is a **production-ready Python framework for exploring, annotating, and managing data** in...21 versions - Latest release: over 1 year ago - 1 dependent package - 1,710 stars on GitHub
medaprep 0.1.1
medaprep is a data preparation and feature engineering toolkit for geospatial applications.1 version - Latest release: almost 2 years ago - 1 stars on GitHub
causalnex 0.11.0
A Python library that helps data scientists to infer causation rather than observing correlation.1 version - Latest release: over 1 year ago - 1,811 stars on GitHub
tpot-skrebate 0.11.7
Consider TPOT your Data Science Assistant. TPOT is a Python Automated Machine Learning tool that ...1 version - Latest release: about 3 years ago - 1 dependent package - 8,986 stars on GitHub
pymks 0.4.1
PyMKS is an open source, pythonic implementation of the methodologies developed under the aegis o...4 versions - Latest release: about 2 years ago - 1 dependent repositories - 92 stars on GitHub
scikit-time 0.1
A unified framework for machine learning with time series1 version - Latest release: over 4 years ago - 39 stars on GitHub
apache-airflow-providers-microsoft-azure 4.0.0
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
16 versions - Latest release: almost 2 years ago - 33,967 stars on GitHub
evaml-core 0.12.2
An open source python library for automated feature engineering
1 version - Latest release: almost 4 years ago - 6,558 stars on GitHub
piso 0.9.0
Pandas Interval Set Operations: providing methods for set operations, analytics, lookups and join...9 versions - Latest release: about 2 years ago - 16 stars on GitHub
Top 5.8% on conda-forge.org
wandb 0.13.5
🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the C...48 versions - Latest release: over 1 year ago - 8 dependent packages - 86 dependent repositories - 5,671 stars on GitHub
pandas_schema 0.3.5
A validation library for Pandas data frames using user-friendly schemas
1 version - Latest release: about 4 years ago - 1 dependent repositories - 180 stars on GitHub
vaex-astro 0.9.2
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of ...14 versions - Latest release: over 1 year ago - 2 dependent packages - 1 dependent repositories - 7,837 stars on GitHub
ukcensusapi 1.1.6
UK Census Data queries and downloads from python or R
3 versions - Latest release: about 2 years ago - 1 dependent package - 27 stars on GitHub
vaex-jupyter 0.8.0
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of ...15 versions - Latest release: almost 2 years ago - 1 dependent package - 1 dependent repositories - 7,837 stars on GitHub
leafmap 0.12.1 💰
A Python package for geospatial analysis and interactive mapping in a Jupyter environment
53 versions - Latest release: over 1 year ago - 4 dependent packages - 16 dependent repositories - 1,543 stars on GitHub
r-fastverse 0.3.0
An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and...3 versions - Latest release: over 1 year ago - 175 stars on GitHub
r-dalex 2.4.2 💰
moDel Agnostic Language for Exploration and eXplanation
18 versions - Latest release: almost 2 years ago - 1 dependent package - 1,173 stars on GitHub
Top 8.2% on conda-forge.org
torchmetrics 0.10.3
Torchmetrics is a metrics API created for easy metric development and usage in both PyTorch and [...26 versions - Latest release: over 1 year ago - 9 dependent packages - 50 dependent repositories - 1,314 stars on GitHub
graspy 0.2
A graph, or network, provides a mathematically intuitive representation of data with some sort of...1 version - Latest release: about 4 years ago - 1 dependent package - 291 stars on GitHub
raiutils 0.2.0
Responsible AI Toolbox is a suite of tools providing model and data exploration and assessment us...1 version - Latest release: almost 2 years ago - 1 dependent package - 741 stars on GitHub
scikit-data 0.1.3
This library offers functions to manipulate, clean and visualize data in a easy way.3 versions - Latest release: over 1 year ago - 19 stars on GitHub
dowhy 0.7.1
DoWhy is a Python library for causal inference that supports explicit modeling and testing of cau...7 versions - Latest release: about 2 years ago - 1 dependent package - 2 dependent repositories - 5,750 stars on GitHub
Top 2.7% on conda-forge.org
gensim 4.2.0 💰
Gensim is a Python library for topic modelling, document indexing and similarity retrieval with l...18 versions - Latest release: about 2 years ago - 17 dependent packages - 105 dependent repositories - 14,085 stars on GitHub
Top 5.0% on conda-forge.org
pandas-profiling 3.4.0
Create HTML profiling reports from pandas DataFrame objects
23 versions - Latest release: over 1 year ago - 6 dependent packages - 69 dependent repositories - 10,373 stars on GitHub
vaex-server 0.8.1
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of ...12 versions - Latest release: over 2 years ago - 3 dependent packages - 1 dependent repositories - 7,837 stars on GitHub
tidypandas 0.2.3
A grammar of data manipulation for pandas inspired by tidyverse
3 versions - Latest release: over 1 year ago - 70 stars on GitHub
vaex-ui 0.3.0
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of ...5 versions - Latest release: over 4 years ago - 1 dependent package - 7,837 stars on GitHub
vaex-hdf5 0.13.0
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of ...23 versions - Latest release: over 1 year ago - 2 dependent packages - 1 dependent repositories - 7,837 stars on GitHub
nteract_on_jupyter 2.1.3 💰
📘 The interactive computing suite for you! ✨
2 versions - Latest release: almost 5 years ago - 26 dependent repositories - 5,983 stars on GitHub
Top 5.3% on conda-forge.org
seaborn-base 0.12.1
Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface...6 versions - Latest release: over 1 year ago - 4 dependent packages - 134 dependent repositories - 10,487 stars on GitHub
r-loose.rock 1.1.0
An R :package: that contains a wide set of useful functions for data science and survival analysis
2 versions - Latest release: about 3 years ago - 2 stars on GitHub
geomatics 0.10.1
A python tool for time series of multidimensional scientific data
8 versions - Latest release: almost 4 years ago - 1 stars on GitHub
vaex-arrow 0.5.1
Arrow support for vaex (out of core dataframes)
12 versions - Latest release: almost 4 years ago - 1 dependent package - 7,837 stars on GitHub
creme 0.6.1 💰
🌊 Online machine learning in Python
7 versions - Latest release: over 2 years ago - 4,146 stars on GitHub
rubrix-server 0.18.0
Rubrix is a **production-ready Python framework for exploring, annotating, and managing data** in...15 versions - Latest release: over 1 year ago - 1,710 stars on GitHub
r-rio 0.5.29
A Swiss-Army Knife for Data I/O
5 versions - Latest release: over 2 years ago - 6 dependent packages - 6 dependent repositories - 548 stars on GitHub
vaex-distributed 0.3.0
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of ...3 versions - Latest release: about 5 years ago - 1 dependent package - 7,837 stars on GitHub
mlxtend 0.21.0 💰
A library of Python tools and extensions for data science and machine learning. Contact =========...17 versions - Latest release: over 1 year ago - 2 dependent packages - 9 dependent repositories - 4,310 stars on GitHub
ploomber 0.21.7
The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
60 versions - Latest release: over 1 year ago - 2 dependent repositories - 3,017 stars on GitHub
Related Keywords
python
265
machine-learning
148
mlops
88
data-engineering
75
analytics
70
workflow
70
etl
67
orchestration
65
data-pipelines
64
data-integration
64
scheduler
63
data-orchestrator
63
workflow-automation
62
metadata
62
dagster
61
hacktoberfest
44
data-analysis
40
pandas
36
deep-learning
35
visualization
35
data-visualization
29
automl
27
r
27
ai
27
reproducibility
24
dataframe
24
scikit-learn
24
data
22
hyperparameter-optimization
22
pytorch
21
developer-tools
21
model-selection
21
statistics
20
git
20
jupyter
20
collaboration
20
data-version-control
19
distributed
19
automation
18
time-series
18
spark
17
tensorflow
16
machinelearning
15
jupyter-notebook
15
random-forest
15
data-mining
15
feature-engineering
15
natural-language-processing
13
java
13
matplotlib
12
reinforcement-learning
12
optimization
12
hyperparameter-search
11
pipeline
11
python3
11
r-package
11
parallel
11
automated-machine-learning
11
tabular-data
11
nlp
11
ray
10
rllib
10
serving
10
gradient-boosting
10
exploratory-data-analysis
10
deployment
10
pyarrow
10
memory-mapped-file
10
hdf5
10
bigdata
10
forecasting
10
ml
10
llm-serving
9
datascience
9
sql
9
numpy
8
big-data
8
time-series-analysis
8
extensible
8
notebook
8
flyte
8
flyte-tasks
8
pypi
8
sdk
8
workflows
8
classification
8
rstats
8
adsp
7
ag066833
7
plotly
7
plotly-dash
7
anomaly-detection
7
aiml
7
alzheimer
7
artificial-intelligence
7
alzheimers
7
u01ag066833
7
parameter-tuning
7
nia
7
regression
7