Ecosyste.ms: Packages
An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.
conda-forge.org "data-science" keyword
Top 5.1% on conda-forge.org
dvc 2.34.2
Data Version Control or DVC is an open-source tool for data science and machine learning projects.
212 versions - Latest release: over 1 year ago - 10 dependent packages - 26 dependent repositories - 11,242 stars on GitHub
great-expectations 0.15.32
Great Expectations helps teams save time and promote analytic integrity by offering a unique appr...
144 versions - Latest release: over 1 year ago - 2 dependent packages - 1 dependent repository - 8,121 stars on GitHub
Top 4.9% on conda-forge.org
prefect 2.6.7
Prefect is a workflow management system, designed for modern infrastructure and powered by the op...
133 versions - Latest release: over 1 year ago - 8 dependent packages - 41 dependent repositories - 11,520 stars on GitHub
Top 7.6% on conda-forge.org
dagster 1.0.17
Dagster lets you define pipelines in terms of the data flow between reusable, logical components,...
119 versions - Latest release: over 1 year ago - 60 dependent packages - 2 dependent repositories - 6,905 stars on GitHub
dagit 1.0.17
An orchestration platform for the development, production, and observation of data assets.
118 versions - Latest release: over 1 year ago - 2 dependent repositories - 6,905 stars on GitHub
dagster-prometheus 1.0.17
An orchestration platform for the development, production, and observation of data assets.
114 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-papertrail 1.0.17
An orchestration platform for the development, production, and observation of data assets.
114 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-celery 1.0.17
An orchestration platform for the development, production, and observation of data assets.
114 versions - Latest release: over 1 year ago - 2 dependent packages - 6,905 stars on GitHub
dagster-spark 1.0.17
An orchestration platform for the development, production, and observation of data assets.
114 versions - Latest release: over 1 year ago - 1 dependent package - 6,905 stars on GitHub
dagster-ssh 1.0.17
An orchestration platform for the development, production, and observation of data assets.
114 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-pandas 1.0.17
An orchestration platform for the development, production, and observation of data assets.
114 versions - Latest release: over 1 year ago - 4 dependent packages - 6,905 stars on GitHub
dagster-dask 1.0.17
An orchestration platform for the development, production, and observation of data assets.
114 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-postgres 1.0.17
An orchestration platform for the development, production, and observation of data assets.
114 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dtale 2.9.0
Visualizer for pandas data structures
114 versions - Latest release: over 1 year ago - 2 dependent packages - 2 dependent repositories - 3,944 stars on GitHub
dagster-graphql 1.0.17
An orchestration platform for the development, production, and observation of data assets.
114 versions - Latest release: over 1 year ago - 6 dependent packages - 6,905 stars on GitHub
dagster-datadog 1.0.17
An orchestration platform for the development, production, and observation of data assets.
114 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-github 1.0.17
An orchestration platform for the development, production, and observation of data assets.
114 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-pagerduty 1.0.17
An orchestration platform for the development, production, and observation of data assets.
114 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-slack 1.0.17
An orchestration platform for the development, production, and observation of data assets.
114 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-twilio 1.0.17
An orchestration platform for the development, production, and observation of data assets.
114 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-pyspark 1.0.17
An orchestration platform for the development, production, and observation of data assets.
112 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagstermill 1.0.17
An orchestration platform for the development, production, and observation of data assets.
112 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-gcp 1.0.17
An orchestration platform for the development, production, and observation of data assets.
112 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-airflow 1.0.17
An orchestration platform for the development, production, and observation of data assets.
110 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-k8s 1.0.17
An orchestration platform for the development, production, and observation of data assets.
109 versions - Latest release: over 1 year ago - 1 dependent package - 6,905 stars on GitHub
dagster-snowflake 1.0.17
An orchestration platform for the development, production, and observation of data assets.
108 versions - Latest release: over 1 year ago - 1 dependent package - 6,905 stars on GitHub
dagster-dbt 1.0.17
An orchestration platform for the development, production, and observation of data assets.
106 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-shell 1.0.17
An orchestration platform for the development, production, and observation of data assets.
103 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-ge 1.0.17
An orchestration platform for the development, production, and observation of data assets.
103 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
Top 8.0% on conda-forge.org
geemap 0.17.2 💰
A Python package for interactive mapping with Google Earth Engine, ipyleaflet, and folium.
99 versions - Latest release: over 1 year ago - 4 dependent packages - 27 dependent repositories - 2,792 stars on GitHub
dagster-celery-docker 1.0.17
An orchestration platform for the development, production, and observation of data assets.
98 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-celery-k8s 1.0.17
Dagster lets you define pipelines in terms of the data flow between reusable, logical components,...
98 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
Top 8.6% on conda-forge.org
tiledb 2.12.2
TileDB is an efficient multi-dimensional array management system which introduces a novel on-disk...
88 versions - Latest release: over 1 year ago - 8 dependent packages - 155 dependent repositories - 1,481 stars on GitHub
Top 2.5% on conda-forge.org
dash 2.7.0 💰
Data Apps & Dashboards for Python. No JavaScript Required.
87 versions - Latest release: over 1 year ago - 37 dependent packages - 108 dependent repositories - 18,331 stars on GitHub
dagster-docker 1.0.17
An orchestration platform for the development, production, and observation of data assets.
86 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-mysql 1.0.17
An orchestration platform for the development, production, and observation of data assets.
82 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
awswrangler 2.17.0
An open-source Python package that extends the power of Pandas library to AWS connecting DataFram...
81 versions - Latest release: over 1 year ago - 1 dependent repository - 3,363 stars on GitHub
Top 1.0% on conda-forge.org
ipython 8.6.0 💰
IPython provides a rich architecture for interactive computing with a powerful interactive shell,...
72 versions - Latest release: over 1 year ago - 306 dependent packages - 3,810 dependent repositories - 15,737 stars on GitHub
dvc-azure 2.20.4
Data Version Control or DVC is an open-source tool for data science and machine learning projects.
71 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repository - 11,242 stars on GitHub
dvc-s3 2.21.0
Data Version Control or DVC is an open-source tool for data science and machine learning projects.
70 versions - Latest release: over 1 year ago - 1 dependent package - 12 dependent repositories - 11,242 stars on GitHub
dagster-mlflow 1.0.17
An orchestration platform for the development, production, and observation of data assets.
70 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dvc-gs 2.20.0
Data Version Control or DVC is an open-source tool for data science and machine learning projects.
69 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repository - 11,242 stars on GitHub
Top 1.6% on conda-forge.org
spacy 3.4.3
spaCy is a library for advanced natural language processing in Python and Cython.
68 versions - Latest release: over 1 year ago - 92 dependent packages - 174 dependent repositories - 25,557 stars on GitHub
dvc-oss 2.19.0
Data Version Control or DVC is an open-source tool for data science and machine learning projects.
67 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repository - 11,242 stars on GitHub
dvc-hdfs 2.19.0
Data Version Control or DVC is an open-source tool for data science and machine learning projects.
67 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repository - 11,242 stars on GitHub
dvc-gdrive 2.19.0
Data Version Control or DVC is an open-source tool for data science and machine learning projects.
67 versions - Latest release: over 1 year ago - 1 dependent package - 2 dependent repositories - 11,242 stars on GitHub
dvc-ssh 2.20.0
Data Version Control or DVC is an open-source tool for data science and machine learning projects.
67 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repository - 11,242 stars on GitHub
dvc-webhdfs 2.19.0
Data Version Control or DVC is an open-source tool for data science and machine learning projects.
64 versions - Latest release: over 1 year ago - 1 dependent repository - 11,242 stars on GitHub
Top 8.1% on conda-forge.org
featuretools 1.18.0
Featuretools is a framework to perform automated feature engineering. It excels at transforming t...
63 versions - Latest release: over 1 year ago - 7 dependent packages - 5 dependent repositories - 6,558 stars on GitHub
Top 5.3% on conda-forge.org
catboost 1.1.1
General purpose gradient boosting on decision trees library with categorical features support out...
61 versions - Latest release: over 1 year ago - 9 dependent packages - 32 dependent repositories - 7,012 stars on GitHub
ploomber 0.21.7
The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
60 versions - Latest release: over 1 year ago - 2 dependent repositories - 3,017 stars on GitHub
kfp 1.8.14
Kubeflow is a machine learning (ML) toolkit that is dedicated to making deployments of ML workflo...
58 versions - Latest release: over 1 year ago - 4 dependent packages - 2 dependent repositories - 3,137 stars on GitHub
evalml-core 0.61.1
EvalML is an AutoML library written in Python.
58 versions - Latest release: over 1 year ago - 1 dependent package - 592 stars on GitHub
metaflow 2.7.14
🚀 Build and manage real-life data science projects with ease!
58 versions - Latest release: over 1 year ago - 1 dependent repository - 6,497 stars on GitHub
lifelines 0.27.4 💰
Survival analysis in Python
58 versions - Latest release: over 1 year ago - 4 dependent packages - 9 dependent repositories - 2,055 stars on GitHub
evalml 0.61.1
EvalML is an AutoML library written in Python.
58 versions - Latest release: over 1 year ago - 592 stars on GitHub
dagster-msteams 1.0.17
An orchestration platform for the development, production, and observation of data assets.
56 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
dagster-fivetran 1.0.17
An orchestration platform for the development, production, and observation of data assets.
56 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
Top 0.1% on conda-forge.org
pandas 1.5.1 💰
Flexible and powerful data analysis / manipulation library for Python, providing labeled data str...
56 versions - Latest release: over 1 year ago - 1,663 dependent packages - 11,366 dependent repositories - 37,320 stars on GitHub
dagster-aws 1.0.17
An orchestration platform for the development, production, and observation of data assets.
55 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
vaex-core 4.14.0
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of ...
54 versions - Latest release: over 1 year ago - 10 dependent packages - 1 dependent repository - 7,837 stars on GitHub
Top 8.8% on conda-forge.org
orange3 3.33.0 💰
Open source data visualization and data analysis for novice and expert. Interactive workflows wit...
54 versions - Latest release: over 1 year ago - 15 dependent packages - 2 dependent repositories - 4,009 stars on GitHub
leafmap 0.12.1 💰
A Python package for geospatial analysis and interactive mapping in a Jupyter environment
53 versions - Latest release: over 1 year ago - 4 dependent packages - 16 dependent repositories - 1,543 stars on GitHub
r-catboost 1.1.1
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking,...
49 versions - Latest release: over 1 year ago - 7,013 stars on GitHub
Top 5.8% on conda-forge.org
wandb 0.13.5
🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the C...
48 versions - Latest release: over 1 year ago - 8 dependent packages - 86 dependent repositories - 5,671 stars on GitHub
dagster-airbyte 1.0.17
An orchestration platform for the development, production, and observation of data assets.
47 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
nipype 1.8.5
Workflows and interfaces for neuroimaging packages
45 versions - Latest release: over 1 year ago - 1 dependent package - 5 dependent repositories - 661 stars on GitHub
woodwork 0.20.0
Woodwork is a Python library that provides robust methods for managing and communicating data typ...
44 versions - Latest release: over 1 year ago - 5 dependent packages - 112 stars on GitHub
dagster-cron 0.11.15
An orchestration platform for the development, production, and observation of data assets.
44 versions - Latest release: almost 3 years ago - 6,905 stars on GitHub
koalas 1.8.2
Koalas: pandas API on Apache Spark
42 versions - Latest release: over 2 years ago - 1 dependent package - 3,256 stars on GitHub
dagster-pandera 1.0.17
An orchestration platform for the development, production, and observation of data assets.
34 versions - Latest release: over 1 year ago - 6,905 stars on GitHub
rubicon-ml 0.4.0
rubicon-ml is a machine learning solution designed to help standardize the model development life...
33 versions - Latest release: over 1 year ago - 2 dependent repositories - 99 stars on GitHub
dash-table 5.0.0 💰
OBSOLETE: now part of https://github.com/plotly/dash
31 versions - Latest release: over 2 years ago - 12 dependent packages - 29 dependent repositories - 422 stars on GitHub
flaml 1.0.14
A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
31 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repository - 2,337 stars on GitHub
modin 0.17.0
Modin: Scale your Pandas workflows by changing a single line of code
31 versions - Latest release: over 1 year ago - 3 dependent packages - 2 dependent repositories - 8,468 stars on GitHub
h2o-py 3.38.0.2
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gra...
31 versions - Latest release: over 1 year ago - 6,189 stars on GitHub
Top 0.1% on conda-forge.org
scikit-learn 1.1.3 💰
scikit-learn: machine learning in Python
31 versions - Latest release: over 1 year ago - 647 dependent packages - 5,166 dependent repositories - 53,452 stars on GitHub
Top 0.8% on conda-forge.org
matplotlib 3.6.2 💰
matplotlib is a python 2D plotting library which produces publication quality figures in a variet...
30 versions - Latest release: over 1 year ago - 699 dependent packages - 10,108 dependent repositories - 18,105 stars on GitHub
klib 1.0.6 💰
Easy to use Python library of customized functions for cleaning and analyzing data.
30 versions - Latest release: over 1 year ago - 356 stars on GitHub
Top 0.9% on conda-forge.org
matplotlib-base 3.6.2 💰
matplotlib is a python 2D plotting library which produces publication quality figures in a variet...
29 versions - Latest release: over 1 year ago - 1,213 dependent packages - 2,191 dependent repositories - 18,105 stars on GitHub
mpl_sample_data 3.4.3 💰
matplotlib is a python 2D plotting library which produces publication quality figures in a variet...
29 versions - Latest release: over 2 years ago - 17,056 stars on GitHub
Top 10.0% on conda-forge.org
pyod 1.0.6 💰
A Comprehensive and Scalable Python Library for Outlier Detection (Anomaly Detection)
27 versions - Latest release: over 1 year ago - 3 dependent packages - 4 dependent repositories - 6,847 stars on GitHub
turbodbc 4.5.5
Turbodbc is a Python module to access relational databases via the Open Database Connectivity (OD...
27 versions - Latest release: almost 2 years ago - 2 dependent packages - 2 dependent repositories - 566 stars on GitHub
Top 8.2% on conda-forge.org
torchmetrics 0.10.3
Torchmetrics is a metrics API created for easy metric development and usage in both PyTorch and [...
26 versions - Latest release: over 1 year ago - 9 dependent packages - 50 dependent repositories - 1,314 stars on GitHub
redshift_connector 2.0.908
redshift_connector is the Amazon Redshift connector for Python. Easy integration with pandas and ...
26 versions - Latest release: almost 2 years ago - 2 dependent packages - 178 stars on GitHub
stumpy 1.11.1
STUMPY is a powerful and scalable Python library that computes something called a matrix profile,...
25 versions - Latest release: about 2 years ago - 3 dependent packages - 1 dependent repository - 2,618 stars on GitHub
dvc-webdav 2.19.0
Data Version Control or DVC is an open-source tool for data science and machine learning projects.
24 versions - Latest release: over 1 year ago - 11,242 stars on GitHub
aimodelshare 0.0.144
23 versions - Latest release: over 1 year ago - 36 stars on GitHub
Top 5.0% on conda-forge.org
pandas-profiling 3.4.0
Create HTML profiling reports from pandas DataFrame objects
23 versions - Latest release: over 1 year ago - 6 dependent packages - 69 dependent repositories - 10,373 stars on GitHub
vaex-hdf5 0.13.0
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of ...
23 versions - Latest release: over 1 year ago - 2 dependent packages - 1 dependent repository - 7,837 stars on GitHub
pytorch-forecasting 0.10.2
PyTorch Forecasting is a timeseries forecasting package for PyTorch built on PyTorch Lightning. I...
23 versions - Latest release: almost 2 years ago - 1 dependent repository - 2,683 stars on GitHub
modin-dask 0.17.0
Modin: Scale your Pandas workflows by changing a single line of code
22 versions - Latest release: over 1 year ago - 5 dependent packages - 1 dependent repository - 8,468 stars on GitHub
modin-ray 0.17.0
Modin: Scale your Pandas workflows by changing a single line of code
22 versions - Latest release: over 1 year ago - 4 dependent packages - 8,468 stars on GitHub
modin-core 0.17.0
Modin: Scale your Pandas workflows by changing a single line of code
22 versions - Latest release: over 1 year ago - 9 dependent packages - 1 dependent repository - 8,468 stars on GitHub
Top 1.1% on conda-forge.org
keras 2.10.0
Deep Learning for humans
22 versions - Latest release: over 1 year ago - 29 dependent packages - 327 dependent repositories - 57,664 stars on GitHub
mage-ai 0.7.5
🧙 The modern replacement for Airflow. Build, run, and manage data pipelines for integrating and t...
22 versions - Latest release: over 1 year ago - 3,631 stars on GitHub
modin-all 0.17.0
Modin: Scale your Pandas workflows by changing a single line of code
22 versions - Latest release: over 1 year ago - 8,468 stars on GitHub
cdsdashboards-singleuser 0.6.3
JupyterHub extension for ContainDS Dashboards
22 versions - Latest release: over 1 year ago - 6 dependent repositories - 183 stars on GitHub
cdsdashboards 0.6.3
JupyterHub extension for ContainDS Dashboards
22 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repository - 183 stars on GitHub
rubrix 0.18.0
Rubrix is a production-ready Python framework for exploring, annotating, and managing data in...
21 versions - Latest release: over 1 year ago - 1 dependent package - 1,710 stars on GitHub
Related Keywords
python (265)
machine-learning (148)
mlops (88)
data-engineering (75)
workflow (70)
analytics (70)
etl (67)
orchestration (65)
data-integration (64)
data-pipelines (64)
scheduler (63)
data-orchestrator (63)
metadata (62)
workflow-automation (62)
dagster (61)
hacktoberfest (44)
data-analysis (40)
pandas (36)
visualization (35)
deep-learning (35)
data-visualization (29)
ai (27)
r (27)
automl (27)
reproducibility (24)
scikit-learn (24)
dataframe (24)
hyperparameter-optimization (22)
data (22)
pytorch (21)
developer-tools (21)
model-selection (21)
jupyter (20)
statistics (20)
collaboration (20)
git (20)
data-version-control (19)
distributed (19)
time-series (18)
automation (18)
spark (17)
tensorflow (16)
feature-engineering (15)
data-mining (15)
jupyter-notebook (15)
random-forest (15)
machinelearning (15)
java (13)
natural-language-processing (13)
optimization (12)
reinforcement-learning (12)
matplotlib (12)
nlp (11)
automated-machine-learning (11)
tabular-data (11)
python3 (11)
parallel (11)
hyperparameter-search (11)
r-package (11)
pipeline (11)
forecasting (10)
ray (10)
rllib (10)
gradient-boosting (10)
ml (10)
serving (10)
bigdata (10)
pyarrow (10)
deployment (10)
memory-mapped-file (10)
hdf5 (10)
exploratory-data-analysis (10)
sql (9)
llm-serving (9)
datascience (9)
classification (8)
notebook (8)
workflows (8)
numpy (8)
sdk (8)
flyte-tasks (8)
extensible (8)
pypi (8)
rstats (8)
flyte (8)
time-series-analysis (8)
big-data (8)
plotly (7)
alzheimers (7)
nia (7)
parameter-tuning (7)
u01ag066833 (7)
regression (7)
plotly-dash (7)
anomaly-detection (7)
artificial-intelligence (7)
alzheimer (7)
aiml (7)
ag066833 (7)
adsp (7)