Ecosyste.ms: Packages

An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org "evaluation" keyword

ml3m 0.0.20
Evaluating your LLM performance
20 versions - Latest release: 8 months ago - 163 downloads last month - 37,327 stars on GitHub - 1 maintainer
evaluators 1.0.3
Various scene understanding and perception evaluation metrics.
3 versions - Latest release: about 1 year ago - 65 downloads last month - 28,811 stars on GitHub - 1 maintainer
fdatasets 1.12.1 removed
HuggingFace/Datasets is an open library of NLP datasets.
1 version - Latest release: about 2 years ago - 14,671 stars on GitHub
Top 8.8% on pypi.org
coconut-develop 3.1.0.post0.dev11 💰
Simple, elegant, Pythonic functional programming.
655 versions - Latest release: 20 days ago - 2 dependent repositories - 3.58 thousand downloads last month - 3,951 stars on GitHub - 2 maintainers
Top 3.6% on pypi.org
coconut 3.1.0 💰
Simple, elegant, Pythonic functional programming.
41 versions - Latest release: 2 months ago - 3 dependent packages - 22 dependent repositories - 2.81 thousand downloads last month - 3,951 stars on GitHub - 1 maintainer
Top 3.5% on pypi.org
evo 1.28.0
Python package for the evaluation of odometry and SLAM
100 versions - Latest release: 7 days ago - 18 dependent repositories - 94.4 thousand downloads last month - 3,023 stars on GitHub - 1 maintainer
yunke_langfuse 2.7.6
A client library for accessing langfuse
2 versions - Latest release: 4 months ago - 23 downloads last month - 2,823 stars on GitHub - 1 maintainer
Top 8.2% on pypi.org
langfuse 2.31.0
A client library for accessing langfuse
321 versions - Latest release: about 24 hours ago - 17 dependent packages - 1 dependent repository - 225 thousand downloads last month - 2,823 stars on GitHub - 1 maintainer
opencompass 0.2.4
A comprehensive toolkit for large model evaluation
10 versions - Latest release: 23 days ago - 248 downloads last month - 2,659 stars on GitHub - 1 maintainer
python-grid5000 1.2.4
A Python wrapper for the GitLab API.
45 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repository - 416 downloads last month - 2,162 stars on GitHub - 2 maintainers
promptbench 0.0.2
PromptBench is a powerful tool designed to scrutinize and analyze the interaction of large langua...
6 versions - Latest release: 5 months ago - 228 downloads last month - 2,071 stars on GitHub - 1 maintainer
chainforge 0.2.6
A Visual Programming Environment for Prompt Engineering
87 versions - Latest release: 9 months ago - 1.11 thousand downloads last month - 2,018 stars on GitHub - 1 maintainer
llama-index-callbacks-uptrain 0.2.0
UpTrain Callback for performing evaluations on the LlamaIndex pipeline
3 versions - Latest release: 1 day ago - 221 downloads last month - 2,017 stars on GitHub - 1 maintainer
Top 9.7% on pypi.org
uptrain 0.7.1
UpTrain - tool to evaluate LLM applications on aspects like factual accuracy, response quality, r...
49 versions - Latest release: 1 day ago - 2 dependent packages - 1 dependent repository - 5.57 thousand downloads last month - 2,017 stars on GitHub - 2 maintainers
Top 1.2% on pypi.org
evaluate 0.4.2
HuggingFace community-driven open-source library of evaluation
15 versions - Latest release: 16 days ago - 222 dependent packages - 2,474 dependent repositories - 2.58 million downloads last month - 1,762 stars on GitHub - 3 maintainers
Top 4.6% on pypi.org
avalanche-lib 0.5.0 💰
Avalanche: a Comprehensive Framework for Continual Learning Research
7 versions - Latest release: 3 months ago - 2 dependent packages - 10 dependent repositories - 1.21 thousand downloads last month - 1,680 stars on GitHub - 1 maintainer
Top 2.3% on pypi.org
pycm 0.9.5 💰
Multi-class confusion matrix library in Python
44 versions - Latest release: almost 6 years ago - 4 dependent packages - 50 dependent repositories - 45.4 thousand downloads last month - 1,430 stars on GitHub - 3 maintainers
Top 1.8% on pypi.org
motmetrics 1.4.0
Metrics for multiple object tracker benchmarking.
9 versions - Latest release: over 1 year ago - 13 dependent packages - 398 dependent repositories - 127 thousand downloads last month - 1,326 stars on GitHub - 1 maintainer
alpaca-eval 0.6.2
AlpacaEval: An Automatic Evaluator of Instruction-following Models
33 versions - Latest release: 27 days ago - 2 dependent packages - 7.19 thousand downloads last month - 1,062 stars on GitHub - 3 maintainers
alpaca-farm 0.2.0
An automatic evaluator for instruction-following language models. Human-validated, high-quality, ...
11 versions - Latest release: 3 months ago - 225 downloads last month - 1,062 stars on GitHub - 1 maintainer
Top 6.2% on pypi.org
xai 0.1.0
XAI - An industry-ready machine learning library that ensures explainable AI by design
5 versions - Latest release: over 2 years ago - 24 dependent repositories - 309 downloads last month - 1,053 stars on GitHub - 1 maintainer
Top 1.3% on pypi.org
sacrebleu 2.4.2
Hassle-free computation of shareable, comparable, and reproducible BLEU, chrF, and TER scores
69 versions - Latest release: about 1 month ago - 96 dependent packages - 4,263 dependent repositories - 1.88 million downloads last month - 972 stars on GitHub - 3 maintainers
sacrebleu-macrof 2.0.1
Hassle-free computation of shareable, comparable, and reproducible BLEU, chrF, and TER scores
1 version - Latest release: over 2 years ago - 1 dependent package - 1 dependent repository - 31 downloads last month - 907 stars on GitHub - 1 maintainer
Top 2.2% on pypi.org
torch-fidelity 0.3.0
High-fidelity performance metrics for generative models in PyTorch
3 versions - Latest release: almost 3 years ago - 9 dependent packages - 773 dependent repositories - 282 thousand downloads last month - 870 stars on GitHub - 1 maintainer
Top 3.5% on pypi.org
rliable 1.0.8
rliable: Reliable evaluation on reinforcement learning and machine learning benchmarks.
9 versions - Latest release: almost 2 years ago - 6 dependent packages - 15 dependent repositories - 5.77 thousand downloads last month - 689 stars on GitHub - 1 maintainer
tidecv-light 1.0.1
A General Toolbox for Identifying ObjectDetection Errors
1 version - Latest release: over 1 year ago - 236 downloads last month - 687 stars on GitHub - 1 maintainer
Top 4.8% on pypi.org
tidecv 1.0.1
A General Toolbox for Identifying ObjectDetection Errors
2 versions - Latest release: over 3 years ago - 3 dependent packages - 6 dependent repositories - 2.67 thousand downloads last month - 687 stars on GitHub - 1 maintainer
Top 6.7% on pypi.org
agenta 0.14.8
The SDK for agenta is an open-source LLMOps platform.
114 versions - Latest release: 2 days ago - 2 dependent repositories - 3.76 thousand downloads last month - 575 stars on GitHub - 1 maintainer
autorag 0.1.11
Automatically Evaluate RAG pipelines with your own data. Find optimal structure for new RAG product.
26 versions - Latest release: 2 days ago - 1.15 thousand downloads last month - 515 stars on GitHub - 1 maintainer
caserecommender 1.1.1
A recommender systems framework for Python
39 versions - Latest release: over 2 years ago - 1 dependent repository - 184 downloads last month - 454 stars on GitHub - 1 maintainer
reclist 2.1.0
RecList
7 versions - Latest release: 9 months ago - 1 dependent repository - 98 downloads last month - 449 stars on GitHub - 2 maintainers
factuality 1.0.14
Benchmarking long-form factuality in large language models. Original code for our paper "Long-for...
4 versions - Latest release: about 1 month ago - 108 downloads last month - 444 stars on GitHub - 1 maintainer
Top 1.8% on pypi.org
simpleeval 0.9.13 💰
A simple, safe single expression evaluator library.
18 versions - Latest release: about 1 year ago - 59 dependent packages - 290 dependent repositories - 1.22 million downloads last month - 424 stars on GitHub - 1 maintainer
Top 5.1% on pypi.org
errant 3.0.0
The ERRor ANnotation Toolkit (ERRANT). Automatically extract and classify edits in parallel sente...
19 versions - Latest release: 6 months ago - 13 dependent repositories - 3.08 thousand downloads last month - 402 stars on GitHub - 4 maintainers
errant-prep 3.2.3
The ERRor ANnotation Toolkit (ERRANT). Automatically extract and classify edits in parall...
23 versions - Latest release: 4 months ago - 49 downloads last month - 402 stars on GitHub - 1 maintainer
rank-eval 0.1.3
rank_eval: A Blazing Fast Python Library for Ranking Evaluation and Comparison
5 versions - Latest release: over 2 years ago - 1 dependent repository - 236 downloads last month - 352 stars on GitHub - 1 maintainer
Top 5.3% on pypi.org
ranx 0.3.19
ranx: A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion
45 versions - Latest release: 6 months ago - 4 dependent packages - 7 dependent repositories - 13.2 thousand downloads last month - 348 stars on GitHub - 1 maintainer
trustllm 0.3.0
TrustLLM
7 versions - Latest release: 19 days ago - 198 downloads last month - 309 stars on GitHub - 1 maintainer
lighteval 0.3.0
A lightweight and configurable evaluation package
8 versions - Latest release: about 2 months ago - 2.07 thousand downloads last month - 299 stars on GitHub - 3 maintainers
Top 4.9% on pypi.org
rexmex 0.1.3
A General Purpose Recommender Metrics Library for Fair Evaluation.
19 versions - Latest release: over 1 year ago - 1 dependent package - 9 dependent repositories - 1.85 thousand downloads last month - 275 stars on GitHub - 5 maintainers
Top 3.3% on pypi.org
pytrec-eval 0.5
Provides Python bindings for popular Information Retrieval measures implemented within trec_eval.
3 versions - Latest release: over 3 years ago - 9 dependent packages - 36 dependent repositories - 104 thousand downloads last month - 249 stars on GitHub - 1 maintainer
Top 5.2% on pypi.org
pytrec-eval-terrier 0.5.6
Provides Python bindings for popular Information Retrieval measures implemented within trec_eval.
6 versions - Latest release: 7 months ago - 1 dependent package - 11 dependent repositories - 31.6 thousand downloads last month - 248 stars on GitHub - 1 maintainer
pytrec-eval-git 0.5
Provides Python bindings for popular Information Retrieval measures implemented within trec_eval.
1 version - Latest release: almost 4 years ago - 1 dependent repository - 10 downloads last month - 235 stars on GitHub - 1 maintainer
Top 5.6% on pypi.org
prdc 0.2
Compute precision, recall, density, and coverage metrics for two sets of vectors.
1 version - Latest release: about 4 years ago - 2 dependent packages - 17 dependent repositories - 1.37 thousand downloads last month - 234 stars on GitHub - 1 maintainer
Top 3.1% on pypi.org
langsmith 0.1.58
Client library to connect to the LangSmith LLM Tracing and Evaluation Platform.
167 versions - Latest release: about 10 hours ago - 86 dependent packages - 2,234 dependent repositories - 8.17 million downloads last month - 217 stars on GitHub - 1 maintainer
zenoml-text-classification 0.0.2
Text Classification for Zeno
2 versions - Latest release: over 1 year ago - 1 dependent repository - 24 downloads last month - 209 stars on GitHub - 1 maintainer
zenoml-image-classification 0.0.3
Image Classification for Zeno
3 versions - Latest release: over 1 year ago - 1 dependent repository - 32 downloads last month - 209 stars on GitHub - 1 maintainer
zenoml 0.6.4
Interactive Evaluation Framework for Machine Learning
51 versions - Latest release: 10 months ago - 1 dependent package - 1 dependent repository - 577 downloads last month - 208 stars on GitHub - 1 maintainer
zenoml-image-segmentation 0.0.1
Image Segmentation for Zeno
1 version - Latest release: almost 2 years ago - 1 dependent repository - 8 downloads last month - 208 stars on GitHub - 1 maintainer
zenoml-audio-transcription 0.0.4
Audio Transcription for Zeno
4 versions - Latest release: over 1 year ago - 1 dependent repository - 14 downloads last month - 205 stars on GitHub - 1 maintainer
Top 4.9% on pypi.org
torcheval 0.0.7
A library for providing a simple interface to create new metrics and an easy-to-use toolkit for m...
7 versions - Latest release: 9 months ago - 25 dependent packages - 8 dependent repositories - 68.4 thousand downloads last month - 194 stars on GitHub - 1 maintainer
inginious 0.8.7
An intelligent grader that allows secured and automated testing of code made by students.
17 versions - Latest release: about 1 year ago - 7 dependent repositories - 109 downloads last month - 187 stars on GitHub - 2 maintainers
Top 1.1% on pypi.org
configspace 0.7.2
Creation and manipulation of parameter configuration spaces for automated algorithm configuration...
43 versions - Latest release: 10 months ago - 31 dependent packages - 56 dependent repositories - 108 thousand downloads last month - 186 stars on GitHub - 2 maintainers
Top 6.5% on pypi.org
moverscore 1.0.3
MoverScore: Evaluating text generation with contextualized embeddings and earth mover distance
2 versions - Latest release: about 4 years ago - 10 dependent repositories - 955 downloads last month - 185 stars on GitHub - 1 maintainer
Top 7.9% on pypi.org
jury 2.2.4
Evaluation toolkit for neural language generation.
22 versions - Latest release: 11 months ago - 1 dependent package - 2 dependent repositories - 604 downloads last month - 178 stars on GitHub - 1 maintainer
Top 5.2% on pypi.org
keras-metrics 1.1.0
Metrics for Keras model evaluation
9 versions - Latest release: about 5 years ago - 41 dependent repositories - 2.77 thousand downloads last month - 166 stars on GitHub - 1 maintainer
Top 8.2% on pypi.org
torcheval-nightly 2024.5.15
A library for providing a simple interface to create new metrics and an easy-to-use toolkit for m...
481 versions - Latest release: about 13 hours ago - 2 dependent packages - 1 dependent repository - 9.82 thousand downloads last month - 156 stars on GitHub - 1 maintainer
Top 9.7% on pypi.org
acconeer-exptool 7.10.0
Acconeer Exploration Tool
83 versions - Latest release: 22 days ago - 1 dependent repository - 1.79 thousand downloads last month - 155 stars on GitHub - 4 maintainers
langcheck 0.7.1
Simple, Pythonic building blocks to evaluate LLM-based applications
12 versions - Latest release: 7 days ago - 2.71 thousand downloads last month - 140 stars on GitHub - 3 maintainers
fiddler-auditor 0.0.5
Auditing large language models made easy.
12 versions - Latest release: 6 months ago - 1 dependent repository - 980 downloads last month - 138 stars on GitHub - 1 maintainer
athina 1.2.18
Python SDK to configure and run evaluations for your LLM-based application
49 versions - Latest release: 1 day ago - 1.37 thousand downloads last month - 136 stars on GitHub - 1 maintainer
replay-rec 0.16.0
RecSys Library
17 versions - Latest release: 2 months ago - 1 dependent package - 1 dependent repository - 4.05 thousand downloads last month - 125 stars on GitHub - 1 maintainer
Top 7.2% on pypi.org
neleval 3.1.1
Command-line evaluation tools for named entity linking and (cross-document) coreference resolution
6 versions - Latest release: about 4 years ago - 2 dependent packages - 5 dependent repositories - 288 downloads last month - 115 stars on GitHub - 1 maintainer
Top 9.9% on pypi.org
xlcalculator 0.5.0
Converts MS Excel formulas to Python and evaluates them.
28 versions - Latest release: over 1 year ago - 1 dependent repository - 12.1 thousand downloads last month - 105 stars on GitHub - 2 maintainers
evalne 0.4.0
Open Source Network Embedding Evaluation toolkit
4 versions - Latest release: almost 2 years ago - 1 dependent repository - 11 downloads last month - 102 stars on GitHub - 1 maintainer
metriculous 0.3.0
Very unstable library containing utilities to measure and visualize statistical properties of mac...
4 versions - Latest release: over 3 years ago - 1 dependent repository - 132 downloads last month - 95 stars on GitHub - 3 maintainers
Top 8.7% on pypi.org
verif 1.3.0
A verification program for meteorological forecasts and observations
17 versions - Latest release: 2 months ago - 1 dependent package - 5 dependent repositories - 499 downloads last month - 81 stars on GitHub - 1 maintainer
mobile-env 2.0.1
mobile-env: An Open Environment for Autonomous Coordination in Wireless Mobile Networks
13 versions - Latest release: 10 months ago - 1 dependent repository - 147 downloads last month - 79 stars on GitHub - 1 maintainer
Top 9.9% on pypi.org
django-access 0.1.2b2
Django-Access - the application introducing dynamic evaluation-based instance-level (row-level) a...
20 versions - Latest release: 4 months ago - 1 dependent package - 4 dependent repositories - 666 downloads last month - 76 stars on GitHub - 1 maintainer
Top 6.5% on pypi.org
table-evaluator 1.6.1
A package to evaluate how close a synthetic data set is to real data.
31 versions - Latest release: 9 months ago - 3 dependent packages - 5 dependent repositories - 1.79 thousand downloads last month - 74 stars on GitHub - 1 maintainer
disaggregators 0.1.2
HuggingFace community-driven open-source library for dataset disaggregation
3 versions - Latest release: over 1 year ago - 120 downloads last month - 66 stars on GitHub - 1 maintainer
nereval 0.2.5
Evaluation script for named entity recognition systems based on F1 score.
3 versions - Latest release: almost 6 years ago - 1 dependent repository - 469 downloads last month - 66 stars on GitHub - 1 maintainer
reclab 0.1.2
A simulation framework for recommender systems.
3 versions - Latest release: almost 3 years ago - 1 dependent repository - 6 downloads last month - 63 stars on GitHub - 1 maintainer
llmuses 0.3.0
Eval-Scope: Lightweight LLMs Evaluation Framework
8 versions - Latest release: about 1 month ago - 1 dependent package - 583 downloads last month - 63 stars on GitHub - 1 maintainer
pycyclops 0.2.8
Framework for healthcare ML implementation
50 versions - Latest release: about 11 hours ago - 1 dependent package - 742 downloads last month - 62 stars on GitHub - 1 maintainer
Top 6.0% on pypi.org
smatch 1.0.4
Smatch (semantic match) tool
5 versions - Latest release: almost 4 years ago - 2 dependent packages - 14 dependent repositories - 871 downloads last month - 62 stars on GitHub - 2 maintainers
lmchallenge 5.2
LM Challenge - A library & tools to evaluate predictive language models.
4 versions - Latest release: over 4 years ago - 1 dependent repository - 17 downloads last month - 60 stars on GitHub - 1 maintainer
Top 9.4% on pypi.org
mcdm 1.4
Python implementation of multiple-criteria decision-making algorithms
5 versions - Latest release: almost 2 years ago - 2 dependent packages - 2 dependent repositories - 375 downloads last month - 59 stars on GitHub - 1 maintainer
Top 6.3% on pypi.org
pymia 0.3.2
A Python package for data handling and evaluation in deep learning-based medical image analysis.
10 versions - Latest release: about 2 years ago - 9 dependent repositories - 745 downloads last month - 57 stars on GitHub - 2 maintainers
dinglehopper 0.9.6
The OCR evaluation tool
7 versions - Latest release: 9 days ago - 157 downloads last month - 54 stars on GitHub - 1 maintainer
reseval 0.1.6
Reproducible Subjective Evaluation
9 versions - Latest release: 2 months ago - 1 dependent package - 1 dependent repository - 78 downloads last month - 53 stars on GitHub - 1 maintainer
booookscore 0.1.3
Official package for our ICLR 2024 paper, "BooookScore: A systematic exploration of book-length s...
4 versions - Latest release: about 1 month ago - 140 downloads last month - 53 stars on GitHub - 1 maintainer
meeteval 0.3.0
MeetEval - A meeting transcription evaluation toolkit
5 versions - Latest release: 29 days ago - 175 downloads last month - 53 stars on GitHub - 1 maintainer
flare22-dsc-nsd-test 0.0.2
FLARE22_DSC_NSD_Evaluation
2 versions - Latest release: almost 2 years ago - 21 downloads last month - 52 stars on GitHub - 1 maintainer
lusmu 0.2
A lazy/forced evaluation library
1 version - Latest release: over 10 years ago - 2 dependent repositories - 7 downloads last month - 52 stars on GitHub - 1 maintainer
nubia-score 0.1.5
NUBIA (NeUral Based Interchangeability Assessor) is a SoTA evaluation metric for text generation
6 versions - Latest release: about 3 years ago - 1 dependent repository - 68 downloads last month - 51 stars on GitHub - 1 maintainer
hydrotools.gcp-client 4.1.2
Retrieve National Water Model data from Google Cloud Platform.
6 versions - Latest release: almost 3 years ago - 48 downloads last month - 49 stars on GitHub - 3 maintainers
Top 10.0% on pypi.org
hydrotools.metrics 1.3.3
Variety of standard model evaluation metrics.
10 versions - Latest release: almost 2 years ago - 4 dependent repositories - 847 downloads last month - 49 stars on GitHub - 3 maintainers
hydrotools.-restclient 3.1.0
General REST api client with built in request caching and retries.
8 versions - Latest release: 7 months ago - 3 dependent packages - 49 stars on GitHub - 3 maintainers
hydrotools 2.2.2
Suite of tools for retrieving USGS NWIS observations and evaluating National Water Model (NWM) data.
3 versions - Latest release: over 2 years ago - 1 dependent repository - 84 downloads last month - 49 stars on GitHub - 1 maintainer
audio-degrader 1.3.1
Tool to introduce controlled degradations to audio
13 versions - Latest release: over 3 years ago - 120 downloads last month - 49 stars on GitHub - 1 maintainer
django-estimators 0.2.1
A django model to persist and track machine learning models
3 versions - Latest release: over 7 years ago - 6 dependent repositories - 16 downloads last month - 49 stars on GitHub - 1 maintainer
hydrotools.nwm-client-new 7.4.0
Retrieve National Water Model data from various sources.
3 versions - Latest release: about 2 months ago - 38 downloads last month - 49 stars on GitHub - 1 maintainer
hydrotools.svi-client 0.0.2
Retrieve Social Vulnerability Index data from The Center for Disease Control / The Agency for Tox...
2 versions - Latest release: 7 months ago - 24 downloads last month - 49 stars on GitHub - 1 maintainer
hydrotools.events 1.1.5
Various methods to support event-based evaluations.
5 versions - Latest release: over 2 years ago - 103 downloads last month - 49 stars on GitHub - 3 maintainers
Top 8.5% on pypi.org
hydrotools.nwis-client 3.3.1
A convenient interface to the USGS NWIS Instantaneous Values (IV) REST Service API.
12 versions - Latest release: 12 months ago - 7 dependent repositories - 885 downloads last month - 49 stars on GitHub - 3 maintainers
hydrotools.nwm-client 5.0.3
Retrieve National Water Model data from various sources.
3 versions - Latest release: about 2 years ago - 96 downloads last month - 49 stars on GitHub - 1 maintainer
hydrotools.caches 0.1.3
Variety of object caching utilities for OWPHydroTools.
3 versions - Latest release: almost 3 years ago - 103 downloads last month - 49 stars on GitHub - 2 maintainers
eccv-caption 0.1.0
A Python wrapper for Extended COCO Validation (ECCV) Caption dataset
1 version - Latest release: about 2 years ago - 45 downloads last month - 48 stars on GitHub - 1 maintainer
Top 8.9% on pypi.org
bcubed 1.5
Simple extended BCubed implementation in Python for clustering evaluation
1 version - Latest release: over 5 years ago - 12 dependent repositories - 567 downloads last month - 46 stars on GitHub - 1 maintainer