pypi.org "inferentia" keyword
View the packages on the pypi.org package registry that are tagged with the "inferentia" keyword.
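Keyword tags like "inferentia" come from each package's own metadata, which PyPI exposes at `https://pypi.org/pypi/<name>/json` under `info.keywords`. The sketch below shows how such a payload can be checked for a keyword; the payload here is a trimmed, illustrative sample (not live data), and the keyword list shown for vllm is an assumption.

```python
import json

# Trimmed, illustrative sample of the JSON that PyPI's metadata endpoint
# (https://pypi.org/pypi/<name>/json) returns; the keyword list is assumed.
sample_payload = json.dumps({
    "info": {
        "name": "vllm",
        "version": "0.8.4",
        "keywords": "amd, cuda, inference, inferentia, llm, rocm, tpu, trainium",
    }
})

def has_keyword(payload: str, keyword: str) -> bool:
    """Return True if the package metadata lists the given keyword.

    PyPI serves keywords as a single comma-separated string, so we
    split, trim, and lowercase before comparing.
    """
    info = json.loads(payload)["info"]
    keywords = [k.strip().lower() for k in (info.get("keywords") or "").split(",")]
    return keyword.lower() in keywords

print(has_keyword(sample_payload, "inferentia"))  # True
```

In practice the same function can be pointed at a live response fetched with `urllib.request`; registries that index by keyword (like this listing) aggregate exactly this field across packages.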
fmbt 1.0.7
Benchmark performance of **any model** on **any supported instance type** on Amazon SageMaker.
8 versions - Latest release: about 1 year ago - 344 downloads last month - 77 stars on GitHub - 2 maintainers
Top 3.4% on pypi.org
vllm 0.8.4 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
56 versions - Latest release: 5 days ago - 46 dependent packages - 5 dependent repositories - 2.31 million downloads last month - 25,904 stars on GitHub - 4 maintainers
byzerllm 0.1.181 💰
ByzerLLM: Byzer LLM
177 versions - Latest release: 13 days ago - 1 dependent package - 2 dependent repositories - 15.4 thousand downloads last month - 25,904 stars on GitHub - 1 maintainer
hive-vllm 0.0.1 💰
a
1 version - Latest release: about 1 year ago - 36 downloads last month - 25,551 stars on GitHub - 1 maintainer
vllm-xft 0.5.5.3 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
11 versions - Latest release: about 1 month ago - 358 downloads last month - 25,904 stars on GitHub - 2 maintainers
moe-kernels 0.8.2 💰
MoE kernels
15 versions - Latest release: 3 months ago - 319 downloads last month - 25,904 stars on GitHub - 1 maintainer
llm_math 0.2.0 💰
A tool designed to evaluate the performance of large language models on mathematical tasks.
5 versions - Latest release: 6 months ago - 104 downloads last month - 44,312 stars on GitHub - 1 maintainer
nextai-vllm 0.0.7 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
6 versions - Latest release: 12 months ago - 147 downloads last month - 25,904 stars on GitHub - 1 maintainer
ai-dynamo-vllm 0.7.2 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
2 versions - Latest release: about 1 month ago - 1.85 thousand downloads last month - 44,312 stars on GitHub - 1 maintainer
tilearn-test01 0.1 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
1 version - Latest release: about 1 year ago - 25 downloads last month - 25,700 stars on GitHub - 1 maintainer
wxy-test 0.8.1 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
1 version - Latest release: 2 months ago - 38 downloads last month - 44,312 stars on GitHub - 1 maintainer
vllm-rocm 0.6.3 💰
A high-throughput and memory-efficient inference and serving engine for LLMs with AMD GPU support
1 version - Latest release: 6 months ago - 44 downloads last month - 44,312 stars on GitHub - 1 maintainer
vllm-acc 0.4.1 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
8 versions - Latest release: 12 months ago - 226 downloads last month - 25,904 stars on GitHub - 1 maintainer
vllm-emissary 0.1.0 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
2 versions - Latest release: 12 days ago - 211 downloads last month - 44,312 stars on GitHub - 1 maintainer
llm_atc 0.1.7 💰
Tools for fine tuning and serving LLMs
6 versions - Latest release: over 1 year ago - 238 downloads last month - 25,904 stars on GitHub - 1 maintainer
vllm-online 0.4.2 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
2 versions - Latest release: 12 months ago - 54 downloads last month - 25,904 stars on GitHub - 1 maintainer
marlin-kernels 0.3.7 💰
Marlin quantization kernels
11 versions - Latest release: 3 months ago - 244 downloads last month - 25,904 stars on GitHub - 1 maintainer
vllm-npu 0.4.2 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
3 versions - Latest release: 3 months ago - 167 downloads last month - 44,312 stars on GitHub - 1 maintainer
llm-engines 0.0.23 💰
A unified inference engine for large language models (LLMs) including open-source models (VLLM, S...
22 versions - Latest release: about 1 month ago - 774 downloads last month - 25,904 stars on GitHub - 1 maintainer
tilearn-infer 0.3.3 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
3 versions - Latest release: 12 months ago - 68 downloads last month - 25,904 stars on GitHub - 1 maintainer
llm-swarm 0.1.1 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
2 versions - Latest release: about 1 year ago - 89 downloads last month - 25,904 stars on GitHub - 1 maintainer
superlaser 0.0.6 💰
An MLOps library for LLM deployment w/ the vLLM engine on RunPod's infra.
6 versions - Latest release: about 1 year ago - 229 downloads last month - 25,904 stars on GitHub - 1 maintainer
optimum-neuron 0.1.0
Optimum Neuron is the interface between the Hugging Face Transformers and Diffusers libraries and...
30 versions - Latest release: about 1 month ago - 1 dependent repository - 159 thousand downloads last month - 226 stars on GitHub - 3 maintainers
fmbench 2.1.6
Benchmark performance of **any Foundation Model (FM)** deployed on **any AWS Generative AI servic...
83 versions - Latest release: 8 days ago - 2.44 thousand downloads last month - 77 stars on GitHub - 1 maintainer
aphrodite-engine 0.6.5 💰
The inference engine for PygmalionAI models
29 versions - Latest release: 4 months ago - 2.42 thousand downloads last month - 1,374 stars on GitHub - 1 maintainer
Related Keywords
trainium (22)
tpu (22)
rocm (22)
cuda (22)
inference (22)
pytorch (21)
model-serving (21)
mlops (21)
llmops (21)
llm-serving (21)
llm (21)
llama (21)
gpt (21)
amd (21)
transformer (21)
xpu (21)
deepseek (6)
hpu (6)
qwen (6)
sagemaker (2)
p4d (2)
llama2 (2)
generative-ai (2)
foundation-models (2)
benchmarking (2)
benchmark (2)
bedrock (2)
fine-tuning (1)
aws (1)
bring your own endpoint (1)
llama3 (1)
ec2 (1)
eks (1)
api-rest (1)
inference-engine (1)
intel (1)
lora (1)
machine-learning (1)
speculative-decoding (1)
mixed-precision training (1)
diffusers (1)
transformers (1)
cicd (1)
runpod (1)
vllm (1)
deployment (1)
MLOps (1)
NLP (1)
LLM (1)
server (1)