Ecosyste.ms: Packages
An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.
pypi.org "evaluation-metrics" keyword
Top 8.9% on pypi.org
codebleu 0.6.1
Unofficial CodeBLEU implementation that supports Linux, MacOS and Windows available on PyPI.
13 versions - Latest release: about 4 hours ago - 3 dependent repositories - 1.85 thousand downloads last month - 31 stars on GitHub - 1 maintainer
Top 3.5% on pypi.org
unbabel-comet 2.2.2
High-quality Machine Translation Evaluation
33 versions - Latest release: 2 months ago - 4 dependent packages - 45 dependent repositories - 23 thousand downloads last month - 401 stars on GitHub - 2 maintainers
top-pr 0.2.1
TopP&R: Robust Support Estimation Approach for Evaluating Fidelity and Diversity in Generative Mo...
3 versions - Latest release: 8 months ago - 1 dependent package - 9.97 thousand downloads last month - 103 stars on GitHub - 1 maintainer
tonic-validate 6.0.0
RAG evaluation metrics.
24 versions - Latest release: 6 days ago - 2 dependent packages - 2.3 thousand downloads last month - 208 stars on GitHub - 1 maintainer
Top 3.5% on pypi.org
rliable 1.0.8
rliable: Reliable evaluation on reinforcement learning and machine learning benchmarks.
9 versions - Latest release: almost 2 years ago - 6 dependent packages - 15 dependent repositories - 5.77 thousand downloads last month - 689 stars on GitHub - 1 maintainer
Top 5.3% on pypi.org
ranx 0.3.19
ranx: A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion
45 versions - Latest release: 6 months ago - 4 dependent packages - 7 dependent repositories - 13.2 thousand downloads last month - 348 stars on GitHub - 1 maintainer
Top 7.1% on pypi.org
pynlpl 1.2.9
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contai...
102 versions - Latest release: about 5 years ago - 1 dependent package - 4 dependent repositories - 15.5 thousand downloads last month - 476 stars on GitHub - 1 maintainer
Top 5.6% on pypi.org
prdc 0.2
Compute precision, recall, density, and coverage metrics for two sets of vectors.
1 version - Latest release: about 4 years ago - 2 dependent packages - 17 dependent repositories - 1.37 thousand downloads last month - 234 stars on GitHub - 1 maintainer
Top 9.9% on pypi.org
permetrics 2.0.0
PerMetrics: A Framework of Performance Metrics for Machine Learning Models
20 versions - Latest release: 3 months ago - 8 dependent packages - 1 dependent repository - 985 downloads last month - 52 stars on GitHub - 1 maintainer
Top 5.6% on pypi.org
octis 1.13.1
OCTIS: a library for Optimizing and Comparing Topic Models.
34 versions - Latest release: 9 months ago - 2 dependent packages - 3 dependent repositories - 25.5 thousand downloads last month - 681 stars on GitHub - 1 maintainer
athina 1.2.17
Python SDK to configure and run evaluations for your LLM-based application
48 versions - Latest release: about 11 hours ago - 1.16 thousand downloads last month - 135 stars on GitHub - 1 maintainer
Top 1.7% on pypi.org
jiwer 3.0.4
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
20 versions - Latest release: 10 days ago - 43 dependent packages - 1,125 dependent repositories - 528 thousand downloads last month - 540 stars on GitHub - 1 maintainer
coreference-eval 0.0.2
Common metrics and evaluation tools for coreference chains (jsonline format)
2 versions - Latest release: over 1 year ago - 1 dependent repository - 85 downloads last month - 4 stars on GitHub - 1 maintainer
guardrails-ai-unbabel-comet 2.2.1
High-quality Machine Translation Evaluation
1 version - Latest release: 5 months ago - 1 dependent package - 641 downloads last month - 401 stars on GitHub - 1 maintainer
falcon-evaluate 0.1.6
Falcon Evaluate is an open-source Python library designed to simplify the process of evaluating a...
17 versions - Latest release: 7 months ago - 1 dependent package - 281 downloads last month - 6 stars on GitHub - 1 maintainer
Top 9.3% on pypi.org
deepeval 0.21.40
The open-source evaluation framework for LLMs.
208 versions - Latest release: 1 day ago - 7 dependent packages - 1 dependent repository - 34.5 thousand downloads last month - 1,635 stars on GitHub - 2 maintainers
agentops 0.1.10
Python SDK for developing AI agent evals and observability
39 versions - Latest release: 4 days ago - 1 dependent package - 1 dependent repository - 3.95 thousand downloads last month - 613 stars on GitHub - 2 maintainers
testllm 0.14.1
Deep eval provides evaluation platform to accelerate development of LLMs and Agents
1 version - Latest release: 8 months ago - 32 downloads last month - 1,635 stars on GitHub - 1 maintainer
mini-judge 0.4.1
Simple implementation of LLM-As-Judge for pairwise evaluation of Q&A models
6 versions - Latest release: 7 months ago - 52 downloads last month - 2 stars on GitHub - 1 maintainer
cd-fvd 0.1.0.dev1
FVD calculation in PyTorch with I3D or VideoMAE models
2 versions - Latest release: 28 days ago - 231 downloads last month - 26 stars on GitHub - 1 maintainer
kolena 1.18.0
Client for Kolena's machine learning testing platform.
65 versions - Latest release: 1 day ago - 1 dependent repository - 7.63 thousand downloads last month - 38 stars on GitHub - 1 maintainer
kolena-client 1.18.0
Client for Kolena's machine learning testing platform.
70 versions - Latest release: 1 day ago - 1.91 thousand downloads last month - 38 stars on GitHub - 1 maintainer
tvalmetrics 1.0.2
RAG evaluation metrics.
6 versions - Latest release: 5 months ago - 1 dependent package - 78 downloads last month - 208 stars on GitHub - 1 maintainer
easy-lm-eval 0.1.2
A library for easy evaluation of language models
3 versions - Latest release: 2 months ago - 20 downloads last month - 3 stars on GitHub - 1 maintainer
v-stream 0.1.2
STREAM: Spatio-TempoRal Evaluation and Analysis Metric for Video Generative Models
8 versions - Latest release: 4 months ago - 47 downloads last month - 9 stars on GitHub - 1 maintainer
continuous-eval 0.3.7
Open-Source Evaluation for GenAI Application Pipelines.
19 versions - Latest release: 20 days ago - 1.24 thousand downloads last month - 311 stars on GitHub - 1 maintainer
metric-eval 1.0.2
a python package for evaluating evaluation metrics
3 versions - Latest release: 6 months ago - 14 downloads last month - 6 stars on GitHub - 1 maintainer
colortransferlib 1.0.0
This library provides color and style transfer algorithms which were published in scientific paper...
4 versions - Latest release: 6 months ago - 14 downloads last month - 4 stars on GitHub - 1 maintainer
rke-score 0.0.7
Compute Renyi Kernel Entropy scores (RKE-MC and RRKE) for two sets of vectors.
5 versions - Latest release: 6 months ago - 34 downloads last month - 9 stars on GitHub - 1 maintainer
tvallogging 1.0.0
Logging for Tonic Validate
4 versions - Latest release: 5 months ago - 47 downloads last month - 198 stars on GitHub - 1 maintainer
lighteval 0.3.0
A lightweight and configurable evaluation package
8 versions - Latest release: about 2 months ago - 2.07 thousand downloads last month - 299 stars on GitHub - 3 maintainers
cleval 0.1.1
cleval
2 versions - Latest release: 7 months ago - 59 downloads last month - 184 stars on GitHub - 1 maintainer
skloverlay 1.2.0
SKLearn Classification Interface
5 versions - Latest release: 8 months ago - 37 downloads last month - 0 stars on GitHub - 1 maintainer
gleu 1.1.0
GLEU: evaluation metric for grammatical error correction
2 versions - Latest release: 10 months ago - 14 downloads last month - 2 stars on GitHub - 1 maintainer
deepevals 0.2.0
Eval
1 version - Latest release: 9 months ago - 12 downloads last month - 1,635 stars on GitHub - 1 maintainer
llmevals 0.1.0
Eval
2 versions - Latest release: 9 months ago - 30 downloads last month - 1,635 stars on GitHub - 1 maintainer
boost-loss 0.5.5 💰
Utilities for easy use of custom losses in CatBoost, LightGBM, XGBoost
21 versions - Latest release: 4 months ago - 129 downloads last month - 6 stars on GitHub - 1 maintainer
faster-coco-eval 1.5.4
Faster interpretation of the original COCOEval
12 versions - Latest release: 21 days ago - 1 dependent package - 1 dependent repository - 23.1 thousand downloads last month - 31 stars on GitHub - 1 maintainer
nlptutti 0.0.2
nlp measurement package
10 versions - Latest release: over 1 year ago - 1 dependent repository - 965 downloads last month - 45 stars on GitHub - 1 maintainer
daze 0.1.1
Better multi-class confusion matrix plots for Scikit-Learn, incorporating per-class and overall e...
3 versions - Latest release: about 3 years ago - 1 dependent repository - 29 downloads last month - 3 stars on GitHub - 1 maintainer
clayrs 0.5.1
Complexly represent contents, build recommender systems, evaluate them. All in one place!
12 versions - Latest release: 11 months ago - 52 downloads last month - 32 stars on GitHub - 1 maintainer
ward-metrics 0.9.5
Tools for event-based evaluation for activity recognition problems.
6 versions - Latest release: about 6 years ago - 2 dependent repositories - 15 downloads last month - 6 stars on GitHub - 1 maintainer
skflex 1.0.2
skflex provides a suite of flexible utility functions for use with the sklearn library
4 versions - Latest release: over 2 years ago - 1 dependent repository - 7 downloads last month - 0 stars on GitHub - 1 maintainer
regressormetricgraphplot 0.0.3
A simple package for comparing different Regression Models and Plotting with their most common ev...
3 versions - Latest release: over 3 years ago - 1 dependent repository - 72 downloads last month - 6 stars on GitHub - 1 maintainer
rank-eval 0.1.3
rank_eval: A Blazing Fast Python Library for Ranking Evaluation and Comparison
5 versions - Latest release: over 2 years ago - 1 dependent repository - 236 downloads last month - 352 stars on GitHub - 1 maintainer
rankeval 0.8.2
Tool for the analysis and evaluation of Learning to Rank models based on ensembles of regression ...
8 versions - Latest release: over 4 years ago - 1 dependent repository - 395 downloads last month - 87 stars on GitHub - 1 maintainer
quica 0.2.5
Quick Inter Coder Agreement in Python
6 versions - Latest release: over 3 years ago - 1 dependent repository - 60 downloads last month - 23 stars on GitHub - 1 maintainer
pytolemaic 0.15.4
Package for ML model analysis
56 versions - Latest release: almost 2 years ago - 1 dependent repository - 422 downloads last month - 11 stars on GitHub - 1 maintainer
Top 4.6% on pypi.org
pyrouge 0.1.3
A Python wrapper for the ROUGE summarization evaluation package.
4 versions - Latest release: 8 months ago - 339 dependent repositories - 10.3 thousand downloads last month - 244 stars on GitHub - 1 maintainer
probability-calibration 0.0.1
Utilities to calibrate model outcome probability and evaluate calibration.
1 version - Latest release: about 3 years ago - 1 dependent repository - 11 downloads last month - 1 star on GitHub - 1 maintainer
Top 6.9% on pypi.org
nf1 0.0.4
NF1: Normalized F1 score for community evaluation against ground truth
2 versions - Latest release: almost 3 years ago - 1 dependent package - 14 dependent repositories - 71.8 thousand downloads last month - 21 stars on GitHub - 1 maintainer
Top 3.8% on pypi.org
nervaluate 0.1.8 💰
NER evaluation done right
6 versions - Latest release: over 3 years ago - 2 dependent packages - 20 dependent repositories - 4.42 thousand downloads last month - 130 stars on GitHub - 1 maintainer
nereval 0.2.5
Evaluation script for named entity recognition systems based on F1 score.
3 versions - Latest release: almost 6 years ago - 1 dependent repository - 469 downloads last month - 66 stars on GitHub - 1 maintainer
guap 0.1.4
Open-source evaluation metric for linking Machine Learning model outputs with Business outcomes
4 versions - Latest release: almost 3 years ago - 1 dependent repository - 50 downloads last month - 23 stars on GitHub - 1 maintainer
fightin-words 1.0.5
An implementation of Monroe et al.'s Fightin' Words Analysis
2 versions - Latest release: about 5 years ago - 2 dependent repositories - 23 downloads last month - 10 stars on GitHub - 1 maintainer
Top 5.2% on pypi.org
fast-bss-eval 0.1.4
Package for fast computation of BSS Eval metrics for source separation
7 versions - Latest release: almost 2 years ago - 3 dependent packages - 9 dependent repositories - 28 thousand downloads last month - 123 stars on GitHub - 1 maintainer
evalify 0.1.4
Evaluate your face or voice verification models literally in seconds.
5 versions - Latest release: 11 months ago - 1 dependent repository - 16 downloads last month - 19 stars on GitHub - 1 maintainer
debobo 0.1.6
Package for evaluating object detection models
6 versions - Latest release: almost 5 years ago - 1 dependent repository - 36 downloads last month - 1 star on GitHub - 1 maintainer
ctc-score 0.1.3
CTC: A Unified Framework for Evaluating Natural Language Generation
6 versions - Latest release: almost 2 years ago - 1 dependent repository - 130 downloads last month - 94 stars on GitHub - 1 maintainer
classeval 0.2.2 💰
Python package classeval
17 versions - Latest release: 3 months ago - 2 dependent packages - 2 dependent repositories - 980 downloads last month - 7 stars on GitHub - 1 maintainer
bokbokbok 0.6.1
Custom Losses and Metrics for XGBoost, LightGBM, CatBoost
7 versions - Latest release: almost 3 years ago - 1 dependent repository - 105 downloads last month - 25 stars on GitHub - 1 maintainer
tutti-nlp 0.0.1 removed
nlp measurement package
1 version - Latest release: over 1 year ago - 14 stars on GitHub
repsys-framework 0.4.1
Framework for developing and analyzing recommender systems.
21 versions - Latest release: 9 months ago - 2 dependent repositories - 154 downloads last month - 34 stars on GitHub - 1 maintainer
ir-metrics 0.1.6
The most common information retrieval (IR) metrics
15 versions - Latest release: over 3 years ago - 5 dependent repositories - 710 downloads last month - 3 stars on GitHub - 1 maintainer
gan-evaluator 1.15
GAN Evaluator for IS and FID
9 versions - Latest release: about 1 year ago - 16 downloads last month - 10 stars on GitHub - 1 maintainer
hmeasure 0.1.6
H-Measure Classification Metric
7 versions - Latest release: about 3 years ago - 94 downloads last month - 6 stars on GitHub - 1 maintainer
testclayrs 0.1.1 removed
Complexly represent contents, build recommender systems, evaluate them. All in one place!
2 versions - Latest release: almost 2 years ago - 8 stars on GitHub
Related Keywords
evaluation (22)
evaluation-framework (17)
machine-learning (17)
python (16)
nlp (9)
metrics (7)
llm (7)
llmops (7)
llm-evaluation (6)
deep-learning (5)
large-language-models (5)
natural-language-processing (5)
llm-evaluation-framework (4)
classification (4)
llms (4)
information-retrieval (4)
llm-evaluation-metrics (4)
rag (4)
retrieval-augmented-generation (4)
generative-adversarial-network (4)
scikit-learn (4)
regression (3)
data-science (3)
recall (3)
precision (3)
generative-model (3)
pytorch (3)
wer (3)
evaluation-functions (3)
artificial-intelligence (3)
speech-to-text (3)
ai (3)
word-error-rate (3)
evaluate (3)
plot (3)
recommender-systems (3)
classification-report (2)
nlp-library (2)
python3 (2)
coefficient-of-determination (2)
transcribe (2)
linguistics (2)
style-transfer (2)
text-evaluation (2)
text-digitisation (2)
diversity (2)
named-entity-recognition (2)
mlops (2)
evaluate-models (2)
rmse (2)
mae (2)
testing (2)
ML (2)
Kolena (2)
content-based-recommendation (2)
accuracy (2)
video-generation (2)
graph-based-recommendation (2)
frechet-inception-distance (2)
recommender-system (2)
test (2)
character-error-rate (2)
cer (2)
aws (2)
amazon (2)
summarization (2)
xgboost (2)
sklearn (2)
machine-translation (2)
COMET (2)
Unbabel (2)
Evaluation (2)
Machine Translation (2)
lightgbm (2)
diffusion-models (2)
natural language processing (2)
analysis-framework (2)
custom-loss-functions (2)
score-fusion (2)
ranking-metrics (2)
rank-fusion (2)
information-retrieval-metrics (2)
information-retrieval-evaluation (2)
data-fusion (2)
comparison (2)
numba (2)
metasearch (2)
speech-recognition (2)
ranking (2)
speech-analysis (2)
information retrieval (2)
trec_eval (2)
normalization (2)
korean (2)
computing-error-rates (2)
correlation-coefficient (1)
validation-curve (1)
utilities (1)
custom-loss (1)
graph (1)