Ecosyste.ms: Packages

An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org "evaluation" keyword

Top 3.1% on pypi.org
langsmith 0.1.58
Client library to connect to the LangSmith LLM Tracing and Evaluation Platform.
167 versions - Latest release: about 6 hours ago - 86 dependent packages - 2,234 dependent repositories - 8.17 million downloads last month - 217 stars on GitHub - 1 maintainer
pycyclops 0.2.8
Framework for healthcare ML implementation
50 versions - Latest release: about 7 hours ago - 1 dependent package - 742 downloads last month - 62 stars on GitHub - 1 maintainer
Top 8.2% on pypi.org
torcheval-nightly 2024.5.15
A library for providing a simple interface to create new metrics and an easy-to-use toolkit for m...
481 versions - Latest release: about 9 hours ago - 2 dependent packages - 1 dependent repository - 9.82 thousand downloads last month - 156 stars on GitHub - 1 maintainer
Top 8.2% on pypi.org
langfuse 2.31.0
A client library for accessing langfuse
321 versions - Latest release: about 20 hours ago - 17 dependent packages - 1 dependent repository - 225 thousand downloads last month - 2,823 stars on GitHub - 1 maintainer
Top 8.9% on pypi.org
codebleu 0.6.1
Unofficial CodeBLEU implementation that supports Linux, MacOS and Windows available on PyPI.
13 versions - Latest release: 1 day ago - 3 dependent repositories - 1.85 thousand downloads last month - 31 stars on GitHub - 1 maintainer
llama-index-callbacks-uptrain 0.2.0
UpTrain Callback for performing evaluations on the LlamaIndex pipeline
3 versions - Latest release: 1 day ago - 221 downloads last month - 2,017 stars on GitHub - 1 maintainer
athina 1.2.18
Python SDK to configure and run evaluations for your LLM-based application
49 versions - Latest release: 1 day ago - 1.37 thousand downloads last month - 136 stars on GitHub - 1 maintainer
Top 9.7% on pypi.org
uptrain 0.7.1
UpTrain - tool to evaluate LLM applications on aspects like factual accuracy, response quality, r...
49 versions - Latest release: 1 day ago - 2 dependent packages - 1 dependent repository - 5.57 thousand downloads last month - 2,017 stars on GitHub - 2 maintainers
autorag 0.1.11
Automatically Evaluate RAG pipelines with your own data. Find optimal structure for new RAG product.
26 versions - Latest release: 1 day ago - 1.15 thousand downloads last month - 515 stars on GitHub - 1 maintainer
kolena-client 1.18.0
Client for Kolena's machine learning testing platform.
70 versions - Latest release: 2 days ago - 1.91 thousand downloads last month - 38 stars on GitHub - 1 maintainer
kolena 1.18.0
Client for Kolena's machine learning testing platform.
65 versions - Latest release: 2 days ago - 1 dependent repository - 7.63 thousand downloads last month - 38 stars on GitHub - 1 maintainer
Top 6.7% on pypi.org
agenta 0.14.8
The SDK for agenta is an open-source LLMOps platform.
114 versions - Latest release: 2 days ago - 2 dependent repositories - 3.76 thousand downloads last month - 575 stars on GitHub - 1 maintainer
dyff 0.18.0
Meta-package to install the local SDK for the Dyff AI auditing platform.
20 versions - Latest release: 5 days ago - 398 downloads last month - 5 maintainers
dyff-client 0.5.0
Python client for the Dyff AI auditing platform.
12 versions - Latest release: 5 days ago - 2 dependent packages - 769 downloads last month - 0 stars on GitLab.com - 5 maintainers
dyff-schema 0.5.3
Data models for the Dyff AI auditing platform.
22 versions - Latest release: 5 days ago - 4 dependent packages - 1.27 thousand downloads last month - 0 stars on GitLab.com - 5 maintainers
redlite 0.2.0
LLM testing on steroids
58 versions - Latest release: 5 days ago - 488 downloads last month - 0 stars on GitHub - 1 maintainer
promptmodel 0.1.19
Prompt & model versioning on the cloud, built for developers.
50 versions - Latest release: 6 days ago - 347 downloads last month - 11 stars on GitHub - 2 maintainers
Top 3.5% on pypi.org
evo 1.28.0
Python package for the evaluation of odometry and SLAM
100 versions - Latest release: 6 days ago - 18 dependent repositories - 94.4 thousand downloads last month - 3,023 stars on GitHub - 1 maintainer
langcheck 0.7.1
Simple, Pythonic building blocks to evaluate LLM-based applications
12 versions - Latest release: 7 days ago - 2.71 thousand downloads last month - 140 stars on GitHub - 3 maintainers
trajectopy 2.0.14
Trajectory Evaluation in Python
43 versions - Latest release: 7 days ago - 838 downloads last month - 21 stars on GitHub - 1 maintainer
trajectopy-core 3.1.0
Trajectory Evaluation in Python
46 versions - Latest release: 7 days ago - 2 dependent packages - 883 downloads last month - 1 star on GitHub - 1 maintainer
dinglehopper 0.9.6
The OCR evaluation tool
7 versions - Latest release: 9 days ago - 157 downloads last month - 54 stars on GitHub - 1 maintainer
picai-eval 1.4.6
PICAI Evaluation
5 versions - Latest release: 9 days ago - 1 dependent package - 829 downloads last month - 15 stars on GitHub - 1 maintainer
mlrl-testbed 0.10.0
Provides utilities for the training and evaluation of multi-label rule learning algorithms
7 versions - Latest release: 11 days ago - 1 dependent repository - 39 downloads last month - 0 stars on GitHub - 1 maintainer
calldict 0.11
Protocol to markup and evaluate functions in data structures
9 versions - Latest release: 11 days ago - 1 dependent repository - 227 downloads last month - 2 stars on GitHub - 1 maintainer
identitychain 0.1.0
Evaluation Framework for Code Large Language Models (Code LLMs)
2 versions - Latest release: 12 days ago - 7 downloads last month - 6 stars on GitHub - 1 maintainer
umbrela 0.0.7
A Package for generating query-passage relevance assessment labels.
7 versions - Latest release: 14 days ago - 508 downloads last month - 0 stars on GitHub - 1 maintainer
frd-score 0.0.1
Package for calculating Fréchet Radiomics Distance (FRD)
1 version - Latest release: 14 days ago - 79 downloads last month - 0 stars on GitHub - 1 maintainer
quotientai 0.0.4
CLI for evaluating large language models with Quotient
4 versions - Latest release: 15 days ago - 300 downloads last month - 1 maintainer
Top 1.2% on pypi.org
evaluate 0.4.2
HuggingFace community-driven open-source library of evaluation
15 versions - Latest release: 15 days ago - 222 dependent packages - 2,474 dependent repositories - 2.58 million downloads last month - 1,762 stars on GitHub - 3 maintainers
tsml-eval 0.3.0
A package for benchmarking time series machine learning tools.
7 versions - Latest release: 16 days ago - 179 downloads last month - 20 stars on GitHub - 1 maintainer
json-criteria 0.2.0
Python library designed for evaluating data against serializable JSON criteria
6 versions - Latest release: 18 days ago - 25 downloads last month - 1 star on GitHub - 1 maintainer
trustllm 0.3.0
TrustLLM
7 versions - Latest release: 19 days ago - 198 downloads last month - 309 stars on GitHub - 1 maintainer
Top 8.8% on pypi.org
coconut-develop 3.1.0.post0.dev11 💰
Simple, elegant, Pythonic functional programming.
655 versions - Latest release: 20 days ago - 2 dependent repositories - 3.58 thousand downloads last month - 3,951 stars on GitHub - 2 maintainers
python-adc-eval 0.2.0
ADC Evaluation Library
8 versions - Latest release: 21 days ago - 142 downloads last month - 1 star on GitHub - 1 maintainer
dyff-audit 0.3.1
Audit tools for the Dyff AI auditing platform.
12 versions - Latest release: 22 days ago - 1 dependent package - 355 downloads last month - 0 stars on GitLab.com - 5 maintainers
Top 9.7% on pypi.org
acconeer-exptool 7.10.0
Acconeer Exploration Tool
83 versions - Latest release: 22 days ago - 1 dependent repository - 1.79 thousand downloads last month - 155 stars on GitHub - 4 maintainers
opencompass 0.2.4
A comprehensive toolkit for large model evaluation
10 versions - Latest release: 23 days ago - 248 downloads last month - 2,659 stars on GitHub - 1 maintainer
factscorelite 1.3.0
FactScore (Fine-grained atomic evaluation of factual precision in long form text generation) comp...
10 versions - Latest release: 23 days ago - 882 downloads last month - 0 stars on GitHub - 1 maintainer
sed-scores-eval 0.0.3
(Threshold-Independent) Evaluation of Sound Event Detection Scores
4 versions - Latest release: 25 days ago - 368 downloads last month - 22 stars on GitHub - 1 maintainer
alpaca-eval 0.6.2
AlpacaEval : An Automatic Evaluator of Instruction-following Models
33 versions - Latest release: 27 days ago - 2 dependent packages - 7.19 thousand downloads last month - 1,062 stars on GitHub - 3 maintainers
ntqr 0.3.2
Tools for the logic of evaluation using unlabeled data
5 versions - Latest release: 27 days ago - 254 downloads last month - 34 stars on GitHub - 1 maintainer
panoptica 0.6.5
Panoptic Quality (PQ) computation for binary masks.
53 versions - Latest release: 27 days ago - 1 dependent repository - 324 downloads last month - 12 stars on GitHub - 1 maintainer
meeteval 0.3.0
MeetEval - A meeting transcription evaluation toolkit
5 versions - Latest release: 29 days ago - 175 downloads last month - 53 stars on GitHub - 1 maintainer
maihem 1.4.2
LLM evaluations and synthetic data generation with the MAIHEM models
8 versions - Latest release: 30 days ago - 121 downloads last month - 12 stars on GitHub - 1 maintainer
Top 1.3% on pypi.org
sacrebleu 2.4.2
Hassle-free computation of shareable, comparable, and reproducible BLEU, chrF, and TER scores
69 versions - Latest release: about 1 month ago - 96 dependent packages - 4,263 dependent repositories - 1.88 million downloads last month - 972 stars on GitHub - 3 maintainers
llmuses 0.3.0
Eval-Scope: Lightweight LLMs Evaluation Framework
8 versions - Latest release: about 1 month ago - 1 dependent package - 583 downloads last month - 63 stars on GitHub - 1 maintainer
llama-index-packs-llama-dataset-metadata 0.1.4
llama-index packs llama_dataset_metadata integration
5 versions - Latest release: about 1 month ago - 53 downloads last month - 1 maintainer
faster-etapr 0.1.2
Faster implementation of the enhanced time-aware precision and recall (eTaPR) for the evaluation ...
3 versions - Latest release: about 1 month ago - 110 downloads last month - 1 star on GitHub - 1 maintainer
Top 8.1% on pypi.org
enoslib 9.2.0
226 versions - Latest release: about 1 month ago - 1 dependent package - 3 dependent repositories - 1.18 thousand downloads last month - 3 maintainers
booookscore 0.1.3
Official package for our ICLR 2024 paper, "BooookScore: A systematic exploration of book-length s...
4 versions - Latest release: about 1 month ago - 140 downloads last month - 53 stars on GitHub - 1 maintainer
Top 6.6% on pypi.org
fore 0.1.7
fore ai packages
9 versions - Latest release: about 1 month ago - 4 dependent repositories - 79.5 thousand downloads last month - 10 stars on GitHub - 1 maintainer
Top 7.7% on pypi.org
insight 1.0
A python library for monitoring, comparing and extracting insights from data.
23 versions - Latest release: about 1 month ago - 5 dependent repositories - 6.59 thousand downloads last month - 12 stars on GitHub - 1 maintainer
factuality 1.0.14
Benchmarking long-form factuality in large language models. Original code for our paper "Long-for...
4 versions - Latest release: about 1 month ago - 108 downloads last month - 444 stars on GitHub - 1 maintainer
lighteval 0.3.0
A lightweight and configurable evaluation package
8 versions - Latest release: about 2 months ago - 2.07 thousand downloads last month - 299 stars on GitHub - 3 maintainers
hydrotools.nwm-client-new 7.4.0
Retrieve National Water Model data from various sources.
3 versions - Latest release: about 2 months ago - 38 downloads last month - 49 stars on GitHub - 1 maintainer
rag-eval 0.1.3
A RAG evaluation framework
4 versions - Latest release: about 2 months ago - 471 downloads last month - 1 maintainer
multimedeval 0.1.1
A Python tool to evaluate the performance of VLM on the medical domain.
2 versions - Latest release: about 2 months ago - 56 downloads last month - 18 stars on GitHub - 1 maintainer
synthesized-datasets 1.7
Publicly available datasets for benchmarking and evaluation.
15 versions - Latest release: 2 months ago - 3 dependent packages - 7.72 thousand downloads last month - 1 star on GitHub - 1 maintainer
replay-rec 0.16.0
RecSys Library
17 versions - Latest release: 2 months ago - 1 dependent package - 1 dependent repository - 4.05 thousand downloads last month - 125 stars on GitHub - 1 maintainer
Top 8.7% on pypi.org
verif 1.3.0
A verification program for meteorological forecasts and observations
17 versions - Latest release: 2 months ago - 1 dependent package - 5 dependent repositories - 499 downloads last month - 81 stars on GitHub - 1 maintainer
tieval 0.1.2
A framework for evaluation and development of temporal-aware models.
11 versions - Latest release: 2 months ago - 1 dependent repository - 72 downloads last month - 14 stars on GitHub - 1 maintainer
easy-lm-eval 0.1.2
A library for easy evaluation of language models
3 versions - Latest release: 2 months ago - 20 downloads last month - 3 stars on GitHub - 1 maintainer
reseval 0.1.6
Reproducible Subjective Evaluation
9 versions - Latest release: 2 months ago - 1 dependent package - 1 dependent repository - 78 downloads last month - 53 stars on GitHub - 1 maintainer
Top 3.6% on pypi.org
coconut 3.1.0 💰
Simple, elegant, Pythonic functional programming.
41 versions - Latest release: 2 months ago - 3 dependent packages - 22 dependent repositories - 2.81 thousand downloads last month - 3,951 stars on GitHub - 1 maintainer
id-marl-eval 0.0.4
A Python library for Multi-Agent Reinforcement Learning evaluation.
4 versions - Latest release: 3 months ago - 39 downloads last month - 44 stars on GitHub - 1 maintainer
audmetric 1.2.1
Evaluate machine-learning models
11 versions - Latest release: 3 months ago - 2 dependent packages - 2 dependent repositories - 1.56 thousand downloads last month - 1 star on GitHub - 1 maintainer
tno.sdg.tabular.eval.utility-metrics 0.3.0
Utility metrics for tabular data
1 version - Latest release: 3 months ago - 8 downloads last month - 2 stars on GitHub - 1 maintainer
Top 4.6% on pypi.org
avalanche-lib 0.5.0 💰
Avalanche: a Comprehensive Framework for Continual Learning Research
7 versions - Latest release: 3 months ago - 2 dependent packages - 10 dependent repositories - 1.21 thousand downloads last month - 1,680 stars on GitHub - 1 maintainer
ragrank 0.1.0
An evaluation library for RAG models
7 versions - Latest release: 3 months ago - 134 downloads last month - 17 stars on GitHub - 1 maintainer
alpaca-farm 0.2.0
An automatic evaluator for instruction-following language models. Human-validated, high-quality, ...
11 versions - Latest release: 3 months ago - 225 downloads last month - 1,062 stars on GitHub - 1 maintainer
llama-index-packs-rag-evaluator 0.1.3
llama-index packs rag_evaluator integration
5 versions - Latest release: 3 months ago - 65 downloads last month - 1 maintainer
phasellm 0.0.21
Wrappers for common large language models (LLMs) with support for evaluation.
21 versions - Latest release: 3 months ago - 1 dependent package - 1 dependent repository - 414 downloads last month - 1 maintainer
ctxpro 0.0.5
Simple toolkit that extracts ambiguities in documents that require context to resolve.
5 versions - Latest release: 3 months ago - 21 downloads last month - 0 stars on GitHub - 1 maintainer
mt-thresholds 0.0.4
Tool to check how metric deltas for machine translation reflect on system-level human accuracies.
4 versions - Latest release: 3 months ago - 40 downloads last month - 2 stars on GitHub - 1 maintainer
lighthouz 0.0.5
Lighthouz AI Python SDK
3 versions - Latest release: 3 months ago - 54 downloads last month - 3 stars on GitHub - 1 maintainer
yunke_langfuse 2.7.6
A client library for accessing langfuse
2 versions - Latest release: 4 months ago - 23 downloads last month - 2,823 stars on GitHub - 1 maintainer
errant-prep 3.2.3
The ERRor ANnotation Toolkit (ERRANT). Automatically extract and classify edits in parall...
23 versions - Latest release: 4 months ago - 49 downloads last month - 402 stars on GitHub - 1 maintainer
jurity 2.0.1
fairness and evaluation library
12 versions - Latest release: 4 months ago - 1 dependent package - 4 dependent repositories - 829 downloads last month - 35 stars on GitHub - 5 maintainers
django-access-tastypie 0.1.0b2
Django-Access-Tastypie - the application introducing authorization module for the Tastypie packag...
5 versions - Latest release: 4 months ago - 1 dependent repository - 127 downloads last month - 3 stars on GitHub - 1 maintainer
Top 9.9% on pypi.org
django-access 0.1.2b2
Django-Access - the application introducing dynamic evaluation-based instance-level (row-level) a...
20 versions - Latest release: 4 months ago - 1 dependent package - 4 dependent repositories - 666 downloads last month - 76 stars on GitHub - 1 maintainer
v-stream 0.1.2
STREAM: Spatio-TempoRal Evaluation and Analysis Metric for Video Generative Models
8 versions - Latest release: 4 months ago - 47 downloads last month - 9 stars on GitHub - 1 maintainer
waffle-hub 0.3.1
Waffle hub
31 versions - Latest release: 4 months ago - 192 downloads last month - 39 stars on GitHub - 1 maintainer
econll 0.2.5
Extended CoNLL Utilities for Shallow Parsing
8 versions - Latest release: 4 months ago - 52 downloads last month - 2 stars on GitHub - 1 maintainer
pyevaldata 1.6.0
Python module to evaluate experimental data
17 versions - Latest release: 5 months ago - 1 dependent repository - 120 downloads last month - 7 stars on GitHub - 1 maintainer
tiger-eval 0.0.2
Text Generation Evaluation Toolkit
2 versions - Latest release: 5 months ago - 20 downloads last month - 1 maintainer
promptbench 0.0.2
PromptBench is a powerful tool designed to scrutinize and analyze the interaction of large langua...
6 versions - Latest release: 5 months ago - 228 downloads last month - 2,071 stars on GitHub - 1 maintainer
er-evaluation 2.3.0 💰
An End-to-End Evaluation Framework for Entity Resolution Systems.
9 versions - Latest release: 6 months ago - 1 dependent package - 1 dependent repository - 103 downloads last month - 9 stars on GitHub - 1 maintainer
Top 5.3% on pypi.org
ranx 0.3.19
ranx: A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion
45 versions - Latest release: 6 months ago - 4 dependent packages - 7 dependent repositories - 13.2 thousand downloads last month - 348 stars on GitHub - 1 maintainer
sensirion-uart-svm4x 2.0.3
SHDLC driver for the Sensirion SVM4X sensor family
1 version - Latest release: 6 months ago - 14 downloads last month - 0 stars on GitHub - 1 maintainer
ragstack-ai-langsmith 0.0.1a1 removed
Client library to connect to the LangSmith LLM Tracing and Evaluation Platform.
1 version - Latest release: 6 months ago - 1 maintainer
fiddler-auditor 0.0.5
Auditing large language models made easy.
12 versions - Latest release: 6 months ago - 1 dependent repository - 980 downloads last month - 138 stars on GitHub - 1 maintainer
rke-score 0.0.7
Compute Renyi Kernel Entropy scores (RKE-MC and RRKE) for two sets of vectors.
5 versions - Latest release: 6 months ago - 34 downloads last month - 9 stars on GitHub - 1 maintainer
metric-eval 1.0.2
a python package for evaluating evaluation metrics
3 versions - Latest release: 6 months ago - 14 downloads last month - 6 stars on GitHub - 1 maintainer
Top 5.1% on pypi.org
errant 3.0.0
The ERRor ANnotation Toolkit (ERRANT). Automatically extract and classify edits in parallel sente...
19 versions - Latest release: 6 months ago - 13 dependent repositories - 3.08 thousand downloads last month - 402 stars on GitHub - 4 maintainers
hydrotools.-restclient 3.1.0
General REST api client with built in request caching and retries.
8 versions - Latest release: 7 months ago - 3 dependent packages - 49 stars on GitHub - 3 maintainers
hydrotools.svi-client 0.0.2
Retrieve Social Vulnerability Index data from The Center for Disease Control / The Agency for Tox...
2 versions - Latest release: 7 months ago - 24 downloads last month - 49 stars on GitHub - 1 maintainer
Top 5.2% on pypi.org
pytrec-eval-terrier 0.5.6
Provides Python bindings for popular Information Retrieval measures implemented within trec_eval.
6 versions - Latest release: 7 months ago - 1 dependent package - 11 dependent repositories - 31.6 thousand downloads last month - 248 stars on GitHub - 1 maintainer
evalpm 0.1.2
A framework for creating and evaluating immission models for Particulate Matter
3 versions - Latest release: 7 months ago - 9 downloads last month - 2 stars on GitHub - 1 maintainer
spiraleval 0.1.2 removed
Evaluation for characteristics
3 versions - Latest release: 7 months ago - 249 downloads last month - 1 maintainer