pypi.org "inferentia" keyword
View the packages on the pypi.org package registry that are tagged with the "inferentia" keyword.
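Keyword tags like "inferentia" come from each package's own metadata, which PyPI exposes at `https://pypi.org/pypi/<name>/json` under `info.keywords`. The sketch below shows how such a payload can be checked for a keyword; the payload here is a trimmed, illustrative sample (not live data), and the keyword list shown for vllm is an assumption.

```python
import json

# Trimmed, illustrative sample of the JSON that PyPI's metadata endpoint
# (https://pypi.org/pypi/<name>/json) returns; the keyword list is assumed.
sample_payload = json.dumps({
    "info": {
        "name": "vllm",
        "version": "0.8.4",
        "keywords": "amd, cuda, inference, inferentia, llm, rocm, tpu, trainium",
    }
})

def has_keyword(payload: str, keyword: str) -> bool:
    """Return True if the package metadata lists the given keyword.

    PyPI serves keywords as a single comma-separated string, so we
    split, trim, and lowercase before comparing.
    """
    info = json.loads(payload)["info"]
    keywords = [k.strip().lower() for k in (info.get("keywords") or "").split(",")]
    return keyword.lower() in keywords

print(has_keyword(sample_payload, "inferentia"))  # True
```

In practice the same function can be pointed at a live response fetched with `urllib.request`; registries that index by keyword (like this listing) aggregate exactly this field across packages.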
fmbt 1.0.7
Benchmark performance of **any model** on **any supported instance type** on Amazon SageMaker.
8 versions - Latest release: about 1 year ago - 344 downloads last month - 77 stars on GitHub - 2 maintainers
Top 3.4% on pypi.org
vllm 0.8.4 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
56 versions - Latest release: 5 days ago - 46 dependent packages - 5 dependent repositories - 2.31 million downloads last month - 25,904 stars on GitHub - 4 maintainers
byzerllm 0.1.181 💰
ByzerLLM: Byzer LLM
177 versions - Latest release: 13 days ago - 1 dependent package - 2 dependent repositories - 15.4 thousand downloads last month - 25,904 stars on GitHub - 1 maintainer
hive-vllm 0.0.1 💰
a
1 version - Latest release: about 1 year ago - 36 downloads last month - 25,551 stars on GitHub - 1 maintainer
vllm-xft 0.5.5.3 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
11 versions - Latest release: about 1 month ago - 358 downloads last month - 25,904 stars on GitHub - 2 maintainers
moe-kernels 0.8.2 💰
MoE kernels
15 versions - Latest release: 3 months ago - 319 downloads last month - 25,904 stars on GitHub - 1 maintainer
llm_math 0.2.0 💰
A tool designed to evaluate the performance of large language models on mathematical tasks.
5 versions - Latest release: 6 months ago - 104 downloads last month - 44,312 stars on GitHub - 1 maintainer
nextai-vllm 0.0.7 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
6 versions - Latest release: 12 months ago - 147 downloads last month - 25,904 stars on GitHub - 1 maintainer
ai-dynamo-vllm 0.7.2 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
2 versions - Latest release: about 1 month ago - 1.85 thousand downloads last month - 44,312 stars on GitHub - 1 maintainer
tilearn-test01 0.1 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
1 version - Latest release: about 1 year ago - 25 downloads last month - 25,700 stars on GitHub - 1 maintainer
wxy-test 0.8.1 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
1 version - Latest release: 2 months ago - 38 downloads last month - 44,312 stars on GitHub - 1 maintainer
vllm-rocm 0.6.3 💰
A high-throughput and memory-efficient inference and serving engine for LLMs with AMD GPU support
1 version - Latest release: 6 months ago - 44 downloads last month - 44,312 stars on GitHub - 1 maintainer
vllm-acc 0.4.1 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
8 versions - Latest release: 12 months ago - 226 downloads last month - 25,904 stars on GitHub - 1 maintainer
vllm-emissary 0.1.0 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
2 versions - Latest release: 12 days ago - 211 downloads last month - 44,312 stars on GitHub - 1 maintainer
llm_atc 0.1.7 💰
Tools for fine tuning and serving LLMs
6 versions - Latest release: over 1 year ago - 238 downloads last month - 25,904 stars on GitHub - 1 maintainer
vllm-online 0.4.2 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
2 versions - Latest release: 12 months ago - 54 downloads last month - 25,904 stars on GitHub - 1 maintainer
marlin-kernels 0.3.7 💰
Marlin quantization kernels
11 versions - Latest release: 3 months ago - 244 downloads last month - 25,904 stars on GitHub - 1 maintainer
vllm-npu 0.4.2 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
3 versions - Latest release: 3 months ago - 167 downloads last month - 44,312 stars on GitHub - 1 maintainer
llm-engines 0.0.23 💰
A unified inference engine for large language models (LLMs) including open-source models (VLLM, S...
22 versions - Latest release: about 1 month ago - 774 downloads last month - 25,904 stars on GitHub - 1 maintainer
tilearn-infer 0.3.3 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
3 versions - Latest release: 12 months ago - 68 downloads last month - 25,904 stars on GitHub - 1 maintainer
llm-swarm 0.1.1 💰
A high-throughput and memory-efficient inference and serving engine for LLMs
2 versions - Latest release: about 1 year ago - 89 downloads last month - 25,904 stars on GitHub - 1 maintainer
superlaser 0.0.6 💰
An MLOps library for LLM deployment w/ the vLLM engine on RunPod's infra.
6 versions - Latest release: about 1 year ago - 229 downloads last month - 25,904 stars on GitHub - 1 maintainer
optimum-neuron 0.1.0
Optimum Neuron is the interface between the Hugging Face Transformers and Diffusers libraries and...
30 versions - Latest release: about 1 month ago - 1 dependent repository - 159 thousand downloads last month - 226 stars on GitHub - 3 maintainers
fmbench 2.1.6
Benchmark performance of **any Foundation Model (FM)** deployed on **any AWS Generative AI servic...
83 versions - Latest release: 8 days ago - 2.44 thousand downloads last month - 77 stars on GitHub - 1 maintainer
aphrodite-engine 0.6.5 💰
The inference engine for PygmalionAI models
29 versions - Latest release: 4 months ago - 2.42 thousand downloads last month - 1,374 stars on GitHub - 1 maintainer
Related Keywords
trainium (22)
tpu (22)
rocm (22)
cuda (22)
inference (22)
pytorch (21)
model-serving (21)
mlops (21)
llmops (21)
llm-serving (21)
llm (21)
llama (21)
gpt (21)
amd (21)
transformer (21)
xpu (21)
deepseek (6)
hpu (6)
qwen (6)
sagemaker (2)
p4d (2)
llama2 (2)
generative-ai (2)
foundation-models (2)
benchmarking (2)
benchmark (2)
bedrock (2)
fine-tuning (1)
aws (1)
bring your own endpoint (1)
llama3 (1)
ec2 (1)
eks (1)
api-rest (1)
inference-engine (1)
intel (1)
lora (1)
machine-learning (1)
speculative-decoding (1)
mixed-precision training (1)
diffusers (1)
transformers (1)
cicd (1)
runpod (1)
vllm (1)
deployment (1)
MLOps (1)
NLP (1)
LLM (1)
server (1)