Ecosyste.ms: Packages

An open API service providing package, version and dependency metadata for many open source software ecosystems and registries.

pypi.org "evaluation-metrics" keyword

Top 8.9% on pypi.org
codebleu 0.6.1
Unofficial CodeBLEU implementation for Linux, macOS and Windows, available on PyPI.
13 versions - Latest release: about 4 hours ago - 3 dependent repositories - 1.85 thousand downloads last month - 31 stars on GitHub - 1 maintainer
Top 3.5% on pypi.org
unbabel-comet 2.2.2
High-quality Machine Translation Evaluation
33 versions - Latest release: 2 months ago - 4 dependent packages - 45 dependent repositories - 23 thousand downloads last month - 401 stars on GitHub - 2 maintainers
top-pr 0.2.1
TopP&R: Robust Support Estimation Approach for Evaluating Fidelity and Diversity in Generative Mo...
3 versions - Latest release: 8 months ago - 1 dependent package - 9.97 thousand downloads last month - 103 stars on GitHub - 1 maintainer
tonic-validate 6.0.0
RAG evaluation metrics.
24 versions - Latest release: 6 days ago - 2 dependent packages - 2.3 thousand downloads last month - 208 stars on GitHub - 1 maintainer
Top 3.5% on pypi.org
rliable 1.0.8
rliable: Reliable evaluation on reinforcement learning and machine learning benchmarks.
9 versions - Latest release: almost 2 years ago - 6 dependent packages - 15 dependent repositories - 5.77 thousand downloads last month - 689 stars on GitHub - 1 maintainer
Top 5.3% on pypi.org
ranx 0.3.19
ranx: A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion
45 versions - Latest release: 6 months ago - 4 dependent packages - 7 dependent repositories - 13.2 thousand downloads last month - 348 stars on GitHub - 1 maintainer
Top 7.1% on pypi.org
pynlpl 1.2.9
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contai...
102 versions - Latest release: about 5 years ago - 1 dependent package - 4 dependent repositories - 15.5 thousand downloads last month - 476 stars on GitHub - 1 maintainer
Top 5.6% on pypi.org
prdc 0.2
Compute precision, recall, density, and coverage metrics for two sets of vectors.
1 version - Latest release: about 4 years ago - 2 dependent packages - 17 dependent repositories - 1.37 thousand downloads last month - 234 stars on GitHub - 1 maintainer
Top 9.9% on pypi.org
permetrics 2.0.0
PerMetrics: A Framework of Performance Metrics for Machine Learning Models
20 versions - Latest release: 3 months ago - 8 dependent packages - 1 dependent repository - 985 downloads last month - 52 stars on GitHub - 1 maintainer
Top 5.6% on pypi.org
octis 1.13.1
OCTIS: a library for Optimizing and Comparing Topic Models.
34 versions - Latest release: 9 months ago - 2 dependent packages - 3 dependent repositories - 25.5 thousand downloads last month - 681 stars on GitHub - 1 maintainer
athina 1.2.17
Python SDK to configure and run evaluations for your LLM-based application
48 versions - Latest release: about 11 hours ago - 1.16 thousand downloads last month - 135 stars on GitHub - 1 maintainer
Top 1.7% on pypi.org
jiwer 3.0.4
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
20 versions - Latest release: 10 days ago - 43 dependent packages - 1,125 dependent repositories - 528 thousand downloads last month - 540 stars on GitHub - 1 maintainer
coreference-eval 0.0.2
Common metrics and evaluation tools for coreference chains (jsonline format)
2 versions - Latest release: over 1 year ago - 1 dependent repository - 85 downloads last month - 4 stars on GitHub - 1 maintainer
guardrails-ai-unbabel-comet 2.2.1
High-quality Machine Translation Evaluation
1 version - Latest release: 5 months ago - 1 dependent package - 641 downloads last month - 401 stars on GitHub - 1 maintainer
falcon-evaluate 0.1.6
Falcon Evaluate is an open-source Python library designed to simplify the process of evaluating a...
17 versions - Latest release: 7 months ago - 1 dependent package - 281 downloads last month - 6 stars on GitHub - 1 maintainer
Top 9.3% on pypi.org
deepeval 0.21.40
The open-source evaluation framework for LLMs.
208 versions - Latest release: 1 day ago - 7 dependent packages - 1 dependent repository - 34.5 thousand downloads last month - 1,635 stars on GitHub - 2 maintainers
agentops 0.1.10
Python SDK for developing AI agent evals and observability
39 versions - Latest release: 4 days ago - 1 dependent package - 1 dependent repository - 3.95 thousand downloads last month - 613 stars on GitHub - 2 maintainers
testllm 0.14.1
Deep eval provides an evaluation platform to accelerate development of LLMs and Agents
1 version - Latest release: 8 months ago - 32 downloads last month - 1,635 stars on GitHub - 1 maintainer
mini-judge 0.4.1
Simple implementation of LLM-As-Judge for pairwise evaluation of Q&A models
6 versions - Latest release: 7 months ago - 52 downloads last month - 2 stars on GitHub - 1 maintainer
cd-fvd 0.1.0.dev1
FVD calculation in PyTorch with I3D or VideoMAE models
2 versions - Latest release: 28 days ago - 231 downloads last month - 26 stars on GitHub - 1 maintainer
kolena 1.18.0
Client for Kolena's machine learning testing platform.
65 versions - Latest release: 1 day ago - 1 dependent repository - 7.63 thousand downloads last month - 38 stars on GitHub - 1 maintainer
kolena-client 1.18.0
Client for Kolena's machine learning testing platform.
70 versions - Latest release: 1 day ago - 1.91 thousand downloads last month - 38 stars on GitHub - 1 maintainer
tvalmetrics 1.0.2
RAG evaluation metrics.
6 versions - Latest release: 5 months ago - 1 dependent package - 78 downloads last month - 208 stars on GitHub - 1 maintainer
easy-lm-eval 0.1.2
A library for easy evaluation of language models
3 versions - Latest release: 2 months ago - 20 downloads last month - 3 stars on GitHub - 1 maintainer
v-stream 0.1.2
STREAM: Spatio-TempoRal Evaluation and Analysis Metric for Video Generative Models
8 versions - Latest release: 4 months ago - 47 downloads last month - 9 stars on GitHub - 1 maintainer
continuous-eval 0.3.7
Open-Source Evaluation for GenAI Application Pipelines.
19 versions - Latest release: 20 days ago - 1.24 thousand downloads last month - 311 stars on GitHub - 1 maintainer
metric-eval 1.0.2
A Python package for evaluating evaluation metrics
3 versions - Latest release: 6 months ago - 14 downloads last month - 6 stars on GitHub - 1 maintainer
colortransferlib 1.0.0
This library provides color and style transfer algorithms which were published in scientific paper...
4 versions - Latest release: 6 months ago - 14 downloads last month - 4 stars on GitHub - 1 maintainer
rke-score 0.0.7
Compute Renyi Kernel Entropy scores (RKE-MC and RRKE) for two sets of vectors.
5 versions - Latest release: 6 months ago - 34 downloads last month - 9 stars on GitHub - 1 maintainer
tvallogging 1.0.0
Logging for Tonic Validate
4 versions - Latest release: 5 months ago - 47 downloads last month - 198 stars on GitHub - 1 maintainer
lighteval 0.3.0
A lightweight and configurable evaluation package
8 versions - Latest release: about 2 months ago - 2.07 thousand downloads last month - 299 stars on GitHub - 3 maintainers
cleval 0.1.1
cleval
2 versions - Latest release: 7 months ago - 59 downloads last month - 184 stars on GitHub - 1 maintainer
skloverlay 1.2.0
SKLearn Classification Interface
5 versions - Latest release: 8 months ago - 37 downloads last month - 0 stars on GitHub - 1 maintainer
gleu 1.1.0
GLEU: evaluation metric for grammatical error correction
2 versions - Latest release: 10 months ago - 14 downloads last month - 2 stars on GitHub - 1 maintainer
deepevals 0.2.0
Eval
1 version - Latest release: 9 months ago - 12 downloads last month - 1,635 stars on GitHub - 1 maintainer
llmevals 0.1.0
Eval
2 versions - Latest release: 9 months ago - 30 downloads last month - 1,635 stars on GitHub - 1 maintainer
boost-loss 0.5.5 💰
Utilities for easy use of custom losses in CatBoost, LightGBM, XGBoost
21 versions - Latest release: 4 months ago - 129 downloads last month - 6 stars on GitHub - 1 maintainer
faster-coco-eval 1.5.4
Faster interpretation of the original COCOEval
12 versions - Latest release: 21 days ago - 1 dependent package - 1 dependent repository - 23.1 thousand downloads last month - 31 stars on GitHub - 1 maintainer
nlptutti 0.0.2
NLP measurement package
10 versions - Latest release: over 1 year ago - 1 dependent repository - 965 downloads last month - 45 stars on GitHub - 1 maintainer
daze 0.1.1
Better multi-class confusion matrix plots for Scikit-Learn, incorporating per-class and overall e...
3 versions - Latest release: about 3 years ago - 1 dependent repository - 29 downloads last month - 3 stars on GitHub - 1 maintainer
clayrs 0.5.1
Complexly represent contents, build recommender systems, evaluate them. All in one place!
12 versions - Latest release: 11 months ago - 52 downloads last month - 32 stars on GitHub - 1 maintainer
ward-metrics 0.9.5
Tools for event-based evaluation for activity recognition problems.
6 versions - Latest release: about 6 years ago - 2 dependent repositories - 15 downloads last month - 6 stars on GitHub - 1 maintainer
skflex 1.0.2
skflex provides a suite of flexible utility functions for use with the sklearn library
4 versions - Latest release: over 2 years ago - 1 dependent repository - 7 downloads last month - 0 stars on GitHub - 1 maintainer
regressormetricgraphplot 0.0.3
A simple package for comparing different Regression Models and Plotting with their most common ev...
3 versions - Latest release: over 3 years ago - 1 dependent repository - 72 downloads last month - 6 stars on GitHub - 1 maintainer
rank-eval 0.1.3
rank_eval: A Blazing Fast Python Library for Ranking Evaluation and Comparison
5 versions - Latest release: over 2 years ago - 1 dependent repository - 236 downloads last month - 352 stars on GitHub - 1 maintainer
rankeval 0.8.2
Tool for the analysis and evaluation of Learning to Rank models based on ensembles of regression ...
8 versions - Latest release: over 4 years ago - 1 dependent repository - 395 downloads last month - 87 stars on GitHub - 1 maintainer
quica 0.2.5
Quick Inter Coder Agreement in Python
6 versions - Latest release: over 3 years ago - 1 dependent repository - 60 downloads last month - 23 stars on GitHub - 1 maintainer
pytolemaic 0.15.4
Package for ML model analysis
56 versions - Latest release: almost 2 years ago - 1 dependent repository - 422 downloads last month - 11 stars on GitHub - 1 maintainer
Top 4.6% on pypi.org
pyrouge 0.1.3
A Python wrapper for the ROUGE summarization evaluation package.
4 versions - Latest release: 8 months ago - 339 dependent repositories - 10.3 thousand downloads last month - 244 stars on GitHub - 1 maintainer
probability-calibration 0.0.1
Utilities to calibrate model outcome probability and evaluate calibration.
1 version - Latest release: about 3 years ago - 1 dependent repository - 11 downloads last month - 1 star on GitHub - 1 maintainer
Top 6.9% on pypi.org
nf1 0.0.4
NF1: Normalized F1 score for community evaluation against ground truth
2 versions - Latest release: almost 3 years ago - 1 dependent package - 14 dependent repositories - 71.8 thousand downloads last month - 21 stars on GitHub - 1 maintainer
Top 3.8% on pypi.org
nervaluate 0.1.8 💰
NER evaluation done right
6 versions - Latest release: over 3 years ago - 2 dependent packages - 20 dependent repositories - 4.42 thousand downloads last month - 130 stars on GitHub - 1 maintainer
nereval 0.2.5
Evaluation script for named entity recognition systems based on F1 score.
3 versions - Latest release: almost 6 years ago - 1 dependent repository - 469 downloads last month - 66 stars on GitHub - 1 maintainer
guap 0.1.4
Open-source evaluation metric for linking Machine Learning model outputs with Business outcomes
4 versions - Latest release: almost 3 years ago - 1 dependent repository - 50 downloads last month - 23 stars on GitHub - 1 maintainer
fightin-words 1.0.5
An implementation of Monroe et al.'s Fightin' Words Analysis
2 versions - Latest release: about 5 years ago - 2 dependent repositories - 23 downloads last month - 10 stars on GitHub - 1 maintainer
Top 5.2% on pypi.org
fast-bss-eval 0.1.4
Package for fast computation of BSS Eval metrics for source separation
7 versions - Latest release: almost 2 years ago - 3 dependent packages - 9 dependent repositories - 28 thousand downloads last month - 123 stars on GitHub - 1 maintainer
evalify 0.1.4
Evaluate your face or voice verification models literally in seconds.
5 versions - Latest release: 11 months ago - 1 dependent repository - 16 downloads last month - 19 stars on GitHub - 1 maintainer
debobo 0.1.6
Package for evaluating object detection models
6 versions - Latest release: almost 5 years ago - 1 dependent repository - 36 downloads last month - 1 star on GitHub - 1 maintainer
ctc-score 0.1.3
CTC: A Unified Framework for Evaluating Natural Language Generation
6 versions - Latest release: almost 2 years ago - 1 dependent repository - 130 downloads last month - 94 stars on GitHub - 1 maintainer
classeval 0.2.2 💰
Python package classeval
17 versions - Latest release: 3 months ago - 2 dependent packages - 2 dependent repositories - 980 downloads last month - 7 stars on GitHub - 1 maintainer
bokbokbok 0.6.1
Custom Losses and Metrics for XGBoost, LightGBM, CatBoost
7 versions - Latest release: almost 3 years ago - 1 dependent repository - 105 downloads last month - 25 stars on GitHub - 1 maintainer
tutti-nlp 0.0.1 removed
NLP measurement package
1 version - Latest release: over 1 year ago - 14 stars on GitHub
repsys-framework 0.4.1
Framework for developing and analyzing recommender systems.
21 versions - Latest release: 9 months ago - 2 dependent repositories - 154 downloads last month - 34 stars on GitHub - 1 maintainer
ir-metrics 0.1.6
The most common information retrieval (IR) metrics
15 versions - Latest release: over 3 years ago - 5 dependent repositories - 710 downloads last month - 3 stars on GitHub - 1 maintainer
gan-evaluator 1.15
GAN Evaluator for IS and FID
9 versions - Latest release: about 1 year ago - 16 downloads last month - 10 stars on GitHub - 1 maintainer
hmeasure 0.1.6
H-Measure Classification Metric
7 versions - Latest release: about 3 years ago - 94 downloads last month - 6 stars on GitHub - 1 maintainer
testclayrs 0.1.1 removed
Complexly represent contents, build recommender systems, evaluate them. All in one place!
2 versions - Latest release: almost 2 years ago - 8 stars on GitHub
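The statistics line under each package above follows a regular pattern: fields separated by " - ", each a count followed by a label, plus a free-text "Latest release:" field. For readers who want to work with these figures, the following is a minimal sketch of a parser for that line format. The names `parse_stats`, `SCALE` and `KEYS` are hypothetical helpers invented for this illustration, not part of any Ecosyste.ms API, and the "million" multiplier is an assumption since only "thousand" appears on this page.

```python
import re

# Spelled-out magnitudes; "thousand" appears on this page, "million" is assumed.
SCALE = {"thousand": 1_000, "million": 1_000_000}

# Label prefix -> output key. Prefixes cover singular and plural label forms.
KEYS = [
    ("version", "versions"),
    ("dependent package", "dependent_packages"),
    ("dependent repositor", "dependent_repositories"),
    ("download", "downloads_last_month"),
    ("star", "stars"),
    ("maintainer", "maintainers"),
]

# A count such as "13", "1,125" or "1.85 thousand", followed by its label.
FIELD = re.compile(r"^([\d.,]+)(?: (thousand|million))? (.+)$")


def parse_stats(line):
    """Parse one statistics line into a dict (hypothetical helper)."""
    stats = {}
    for field in line.split(" - "):
        field = field.strip()
        if field.startswith("Latest release:"):
            # Relative dates ("6 days ago") are kept as plain text.
            stats["latest_release"] = field.split(":", 1)[1].strip()
            continue
        match = FIELD.match(field)
        if match is None:
            continue  # Skip anything that is not "<count> <label>".
        number, scale, label = match.groups()
        value = round(float(number.replace(",", "")) * SCALE.get(scale, 1))
        for prefix, key in KEYS:
            if label.startswith(prefix):
                stats[key] = value
                break
    return stats


if __name__ == "__main__":
    example = (
        "13 versions - Latest release: about 4 hours ago - "
        "3 dependent repositories - 1.85 thousand downloads last month - "
        "31 stars on GitHub - 1 maintainer"
    )
    print(parse_stats(example))
```

Fields that are absent from a line (for example, packages with no dependent repositories) are simply missing from the returned dict. Rounded figures such as "1.85 thousand" are only approximations of the real download counts.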
Related Keywords
evaluation (22), evaluation-framework (17), machine-learning (17), python (16), nlp (9), metrics (7), llm (7), llmops (7), llm-evaluation (6), deep-learning (5), large-language-models (5), natural-language-processing (5), llm-evaluation-framework (4), classification (4), llms (4), information-retrieval (4), llm-evaluation-metrics (4), rag (4), retrieval-augmented-generation (4), generative-adversarial-network (4), scikit-learn (4), regression (3), data-science (3), recall (3), precision (3), generative-model (3), pytorch (3), wer (3), evaluation-functions (3), artificial-intelligence (3), speech-to-text (3), ai (3), word-error-rate (3), evaluate (3), plot (3), recommender-systems (3), classification-report (2), nlp-library (2), python3 (2), coefficient-of-determination (2), transcribe (2), linguistics (2), style-transfer (2), text-evaluation (2), text-digitisation (2), diversity (2), named-entity-recognition (2), mlops (2), evaluate-models (2), rmse (2), mae (2), testing (2), ML (2), Kolena (2), content-based-recommendation (2), accuracy (2), video-generation (2), graph-based-recommendation (2), frechet-inception-distance (2), recommender-system (2), test (2), character-error-rate (2), cer (2), aws (2), amazon (2), summarization (2), xgboost (2), sklearn (2), machine-translation (2), COMET (2), Unbabel (2), Evaluation (2), Machine Translation (2), lightgbm (2), diffusion-models (2), natural language processing (2), analysis-framework (2), custom-loss-functions (2), score-fusion (2), ranking-metrics (2), rank-fusion (2), information-retrieval-metrics (2), information-retrieval-evaluation (2), data-fusion (2), comparison (2), numba (2), metasearch (2), speech-recognition (2), ranking (2), speech-analysis (2), information retrieval (2), trec_eval (2), normalization (2), korean (2), computing-error-rates (2), correlation-coefficient (1), validation-curve (1), utilities (1), custom-loss (1), graph (1)