pypi.org "quantization" keyword
Top 1.8% on pypi.org
neural-compressor 3.7.1
Repository of Intel® Neural Compressor
44 versions - Latest release: about 2 months ago - 15 dependent packages - 281 dependent repositories - 25.1 thousand downloads last month - 2,212 stars on GitHub - 1 maintainer
deepsparse-ent 1.9.0
[DEPRECATED] An inference runtime offering GPU-class performance on CPUs and APIs to integrate ML...
15 versions - Latest release: 9 months ago - 2 dependent packages - 1.31 thousand downloads last month - 3,158 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
optimum 2.1.0
Optimum Library is an extension of the Hugging Face Transformers library, providing a framework t...
84 versions - Latest release: 3 months ago - 82 dependent packages - 715 dependent repositories - 1.44 million downloads last month - 2,542 stars on GitHub - 7 maintainers
Top 1.9% on pypi.org
ctranslate2 4.7.1
Fast inference engine for Transformer models
115 versions - Latest release: about 1 month ago - 32 dependent packages - 48 dependent repositories - 4.74 million downloads last month - 3,362 stars on GitHub - 3 maintainers
Top 1.1% on pypi.org
bitsandbytes 0.49.2
k-bit optimizers and matrix multiplication routines.
64 versions - Latest release: 23 days ago - 240 dependent packages - 3,758 dependent repositories - 5.08 million downloads last month - 5,849 stars on GitHub - 2 maintainers
quantization-rs 0.6.0
Neural network quantization toolkit for ONNX models
4 versions - Latest release: 20 days ago - 98 downloads last month - 1 maintainer
quicksilver-inference 0.2.2
High-performance GGUF model inference with quantized kernels
4 versions - Latest release: about 1 month ago - 1 maintainer
sigqnn 0.0.1.dev0
Quantized Neural Networks for Signal Processing (PLACEHOLDER - NOT FUNCTIONAL)
1 version - Latest release: about 1 month ago - 102 downloads last month - 1 maintainer
mps-bitsandbytes 0.7.0
NF4/FP4/FP8/INT8 quantization for PyTorch on Apple Silicon with Metal GPU acceleration
20 versions - Latest release: about 1 month ago - 1 maintainer
Top 1.6% on pypi.org
faster-whisper 1.2.1
Faster Whisper transcription with CTranslate2
21 versions - Latest release: 4 months ago - 53 dependent packages - 35 dependent repositories - 3.44 million downloads last month - 9,301 stars on GitHub - 2 maintainers
gguf-llama 0.0.18
Wrapper for simplified use of Llama2 GGUF quantized models.
9 versions - Latest release: about 2 years ago - 1 dependent package - 124 downloads last month - 5 stars on GitHub - 1 maintainer
Top 3.9% on pypi.org
qkeras 0.9.0
Quantization package for Keras
1 version - Latest release: over 4 years ago - 2 dependent packages - 22 dependent repositories - 1.85 thousand downloads last month - 579 stars on GitHub - 1 maintainer
llamafactory 0.9.4
Unified Efficient Fine-Tuning of 100+ LLMs
10 versions - Latest release: 2 months ago - 20.8 thousand downloads last month - 30,120 stars on GitHub - 1 maintainer
Top 1.3% on pypi.org
tensorflow-model-optimization 0.8.0
A suite of tools that users, both novice and advanced can use to optimize machine learning models...
31 versions - Latest release: about 2 years ago - 16 dependent packages - 882 dependent repositories - 440 thousand downloads last month - 1,472 stars on GitHub - 1 maintainer
glai 0.1.3
Easy deployment of quantized llama models on cpu
11 versions - Latest release: about 2 years ago - 52 downloads last month - 6 stars on GitHub - 1 maintainer
locollm 0.2.0
Local Collaborative LLMs -- Frontier AI on a Student Budget
1 version - Latest release: 5 days ago - 1 maintainer
helix-substrate 0.2.1
Model weight compression and streaming decode library
3 versions - Latest release: 6 days ago - 226 downloads last month - 1 maintainer
torchao 0.16.0
Package for applying ao techniques to GPU models
22 versions - Latest release: 29 days ago - 1 dependent package - 2.02 million downloads last month - 2,710 stars on GitHub - 7 maintainers
Top 9.0% on pypi.org
optimum-intel 1.27.0
Optimum Library is an extension of the Hugging Face Transformers library, providing a framework t...
63 versions - Latest release: 3 months ago - 5 dependent packages - 1 dependent repository - 118 thousand downloads last month - 344 stars on GitHub - 3 maintainers
Top 4.0% on pypi.org
onnx2tf 2.3.0 💰
Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The pur...
593 versions - Latest release: 4 days ago - 6 dependent packages - 4 dependent repositories - 884 thousand downloads last month - 928 stars on GitHub - 1 maintainer
abu-quant 0.0.1
Quantitative stock trading
1 version - Latest release: almost 9 years ago - 1 dependent repository - 106 downloads last month - 1 maintainer
optimum-furiosa 0.1.0
Optimum Furiosa is the interface between the 🤗 Transformers library and Furiosa NPUs such as Furi...
1 version - Latest release: over 2 years ago - 1 dependent package - 9 downloads last month - 26 stars on GitHub - 3 maintainers
xturing 0.1.8
Fine-tuning, evaluation and data generation for LLMs
19 versions - Latest release: over 2 years ago - 122 downloads last month - 2,665 stars on GitHub - 1 maintainer
cizm 0.0.1
Compression Image Models: compression utilities for PyTorch
1 version - Latest release: 8 days ago - 120 downloads last month - 1 maintainer
gguf-modeldb 0.0.3
A Llama2 quantized gguf model db with over 80 preconfigured models downloadable in one line, easl...
8 versions - Latest release: about 2 years ago - 1 dependent package - 111 downloads last month - 11 stars on GitHub - 1 maintainer
jsonvice 1.0.2
jsonvice minifies JSON files by trimming floating point precision.
4 versions - Latest release: over 1 year ago - 1 dependent repository - 44 downloads last month - 2 stars on GitHub - 1 maintainer
Top 4.1% on pypi.org
pngquant-cli 3.0.3
Precompiled binaries for pngquant, the lossy PNG compressor based on libimagequant.
10 versions - Latest release: over 1 year ago - 3 dependent packages - 10 dependent repositories - 12.1 thousand downloads last month - 0 stars on GitHub - 1 maintainer
Top 8.9% on pypi.org
tf-model-optimization-nightly 0.8.0.dev2024021403
A suite of tools that users, both novice and advanced can use to optimize machine learning models...
341 versions - Latest release: about 2 years ago - 1 dependent repository - 525 downloads last month - 1,557 stars on GitHub - 1 maintainer
optimum-onnx 0.1.0
Optimum ONNX is an interface between the Hugging Face libraries and ONNX / ONNX Runtime
4 versions - Latest release: 3 months ago - 317 thousand downloads last month - 118 stars on GitHub - 2 maintainers
llmcompressor-nightly 0.5.0.20250410
A library for compressing large language models utilizing the latest techniques and research in t...
30 versions - Latest release: 11 months ago - 103 downloads last month - 1 maintainer
ctranslate2-arty 4.6.2
Fast inference engine for Transformer models
1 version - Latest release: 3 months ago - 3.54 thousand downloads last month - 1 maintainer
Top 3.5% on pypi.org
deepsparse 1.9.0
[DEPRECATED] An inference runtime offering GPU-class performance on CPUs and APIs to integrate ML...
42 versions - Latest release: 9 months ago - 3 dependent packages - 6 dependent repositories - 1.92 thousand downloads last month - 2,664 stars on GitHub - 1 maintainer
fms-model-optimizer 0.8.1
Quantization Techniques
11 versions - Latest release: 16 days ago - 6.79 thousand downloads last month - 20 stars on GitHub - 1 maintainer
qattn 0.1.1
Efficient GPU Kernels in Triton for Quantized Vision Transformers
3 versions - Latest release: over 1 year ago - 13 downloads last month - 15 stars on GitHub - 1 maintainer
klora 0.1.0
High-Rank Kronecker LoRA for 2-bit Model Fine-tuning
1 version - Latest release: 9 days ago - 1 maintainer
convert-to-quant 1.1.2
Convert safetensors weights to quantized formats (FP8, INT8) with learned rounding optimization
10 versions - Latest release: 18 days ago - 736 downloads last month - 1 maintainer
mblt-model-zoo 1.0.0
A codebase for pre-quantized AI models for Mobilint NPUs.
14 versions - Latest release: 12 days ago - 476 downloads last month - 15 stars on GitHub - 1 maintainer
bitneural32 0.0.15
BitNeural32: 1.58-bit Ternary Neural Network Compiler & QAT Library for ESP32
15 versions - Latest release: 22 days ago - 839 downloads last month - 1 maintainer
bitsandbytes-cuda116 0.26.0.post2
8-bit optimizers and quantization routines.
1 version - Latest release: over 3 years ago - 1 dependent repository - 1.19 thousand downloads last month - 136 stars on GitHub - 1 maintainer
neural-compressor-full 2.1.1
Repository of Intel® Neural Compressor
9 versions - Latest release: almost 3 years ago - 100 downloads last month - 2,512 stars on GitHub - 1 maintainer
Top 7.4% on pypi.org
oprel 0.3.5
Oprel is a high-performance Python library for running large language models locally. It provides...
13 versions - Latest release: 24 days ago - 853 downloads last month - 1 maintainer
picollm 2.0.1
picoLLM Inference Engine
12 versions - Latest release: 26 days ago - 605 downloads last month - 154 stars on GitHub - 1 maintainer
picollmdemo 2.0.0
picoLLM Inference Engine demos
11 versions - Latest release: 3 months ago - 64 downloads last month - 154 stars on GitHub - 1 maintainer
onnx-neural-compressor 1.0
Repository of Neural Compressor ORT
1 version - Latest release: over 1 year ago - 44 downloads last month - 72 stars on GitHub - 1 maintainer
owlite 0.0.8
A fake package to warn the user they are not installing the correct package.
6 versions - Latest release: over 1 year ago - 14 downloads last month - 50 stars on GitHub - 2 maintainers
llama-memory 0.0.1a1
Easy deployment of quantized llama models on cpu
1 version - Latest release: about 2 years ago - 15 downloads last month - 1 star on GitHub - 1 maintainer
canirun 1.0.1
Check if you can run a Hugging Face model locally.
3 versions - Latest release: about 2 months ago - 300 downloads last month - 1 maintainer
Top 4.8% on pypi.org
brevitas 0.12.1
Quantization-aware training in PyTorch
19 versions - Latest release: 6 months ago - 2 dependent packages - 4 dependent repositories - 32 thousand downloads last month - 1,393 stars on GitHub - 4 maintainers
nerfprobe 0.2.0
Scientifically-grounded LLM degradation detection for developers
2 versions - Latest release: 3 months ago - 1 maintainer
isage-amms 0.1.0
Approximate Matrix Multiplication algorithms with unified interface
1 version - Latest release: 12 days ago - 1 maintainer
gptqmodel 5.7.0
Production ready LLM model compression/quantization toolkit with hw accelerated inference support...
51 versions - Latest release: 28 days ago - 28.9 thousand downloads last month - 922 stars on GitHub - 1 maintainer
Top 5.4% on pypi.org
optimum-graphcore 0.7.1
Optimum Library is an extension of the Hugging Face Transformers library, providing a framework t...
18 versions - Latest release: over 2 years ago - 1 dependent package - 7 dependent repositories - 557 downloads last month - 87 stars on GitHub - 2 maintainers
Top 4.2% on pypi.org
navec 0.10.0
Compact high quality word embeddings for russian language
9 versions - Latest release: over 5 years ago - 3 dependent packages - 50 dependent repositories - 29.7 thousand downloads last month - 202 stars on GitHub - 1 maintainer
bitsandbytes-cuda114 0.26.0.post2
8-bit optimizers and quantization routines.
1 version - Latest release: over 3 years ago - 79 downloads last month - 136 stars on GitHub - 1 maintainer
oomllama 0.7.0
Efficient LLM inference with .oom format - 2x smaller than GGUF
13 versions - Latest release: 14 days ago - 643 downloads last month - 1 maintainer
Top 9.4% on pypi.org
bitsandbytes-cuda117 0.26.0.post2
8-bit optimizers and quantization routines.
1 version - Latest release: over 3 years ago - 1 dependent package - 1 dependent repository - 867 downloads last month - 136 stars on GitHub - 1 maintainer
isagellm-compression 0.1.0
Model Compression & Acceleration Module for sageLLM
41 versions - Latest release: about 2 months ago - 3.47 thousand downloads last month
rotalabs-accel 1.0.0
High-performance inference acceleration with Triton kernels, quantization, and speculative decoding
4 versions - Latest release: about 1 month ago - 505 downloads last month - 1 maintainer
zllm-zse 0.1.2
ZSE - Z Server Engine: Ultra memory-efficient LLM inference engine
3 versions - Latest release: 14 days ago - 1 maintainer
tlex-edge 0.1.0
T-LEX: Edge-Decoupled LLM Inference - Run 32B models on ANY device!
1 version - Latest release: 14 days ago - 1 maintainer
hyper-aidev 0.1.1
A Python library to simplify model learning, training, and creation for powerful AI models across...
2 versions - Latest release: 9 months ago - 13 downloads last month - 1 maintainer
genoarmory 0.1.0
A DNA sequence Adversial attack and defense benchmark
1 version - Latest release: 10 months ago - 9 downloads last month - 37 stars on GitHub - 1 maintainer
pyzenith 0.3.5
Cross-Platform ML Optimization Framework with ONNX Interpreter
18 versions - Latest release: 2 months ago - 313 downloads last month - 0 stars on GitHub - 1 maintainer
model-quantizer 0.3.3
A tool for quantizing large language models
6 versions - Latest release: 12 months ago - 84 downloads last month - 2 stars on GitHub - 1 maintainer
bnn 0.1.2
Binarize deep convolutional neural networks using python and pytorch
3 versions - Latest release: over 4 years ago - 1 dependent repository - 46 downloads last month - 148 stars on GitHub - 1 maintainer
llm-insight-forge 0.1.2
A comprehensive toolkit for LLM evaluation, prompt engineering, fine-tuning, and inference optimi...
3 versions - Latest release: 10 months ago - 37 downloads last month - 0 stars on GitHub - 1 maintainer
mobius-faster-whisper 1.1.1
Mobius Version of Faster Whisper transcription with CTranslate2
2 versions - Latest release: over 1 year ago - 690 downloads last month - 24 stars on GitHub - 2 maintainers
clika-client 0.0.2
A fake package to warn the user they are not installing the correct package.
3 versions - Latest release: about 2 years ago - 18 downloads last month - 1 maintainer
llmcompressor 0.9.0
A library for compressing large language models utilizing the latest techniques and research in t...
52 versions - Latest release: 3 months ago - 123 thousand downloads last month
sinq 0.2.0
SINQ quantization I/O for Hugging Face Transformers
9 versions - Latest release: 19 days ago - 296 downloads last month - 1 maintainer
autopack-grn 0.1.3
CLI to quantize and release Hugging Face models in multiple formats
8 versions - Latest release: 6 months ago - 205 downloads last month - 1 maintainer
q-galore-torch 1.0
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
1 version - Latest release: over 1 year ago - 113 downloads last month - 201 stars on GitHub - 1 maintainer
quanto 0.2.0
A quantization toolkit for pytorch.
15 versions - Latest release: almost 2 years ago - 4.02 thousand downloads last month - 1,020 stars on GitHub - 1 maintainer
hnn 0.0.3
A programming framework based on PyTorch for hybrid neural networks with automatic quantization
8 versions - Latest release: about 3 years ago - 1 dependent repository - 19 downloads last month - 19 stars on GitHub - 1 maintainer
Top 9.9% on pypi.org
nlp-architect 0.5.4
Intel AI Lab NLP and NLU research model library
10 versions - Latest release: almost 6 years ago - 1 dependent repository - 390 downloads last month - 2,940 stars on GitHub - 1 maintainer
ctm-ai 0.0.2
A cognitive architecture motivated by consciousness turing machine.
2 versions - Latest release: about 1 year ago - 25 downloads last month - 48,256 stars on GitHub - 1 maintainer
nano-rust-py 0.2.0
TinyML inference engine for embedded devices — Rust no_std core with Python bindings and quantiza...
2 versions - Latest release: 19 days ago - 154 downloads last month - 1 maintainer
gguf2oom 0.1.0
Convert GGUF models to OomLlama's compact OOM format - 2x smaller
1 version - Latest release: 18 days ago - 101 downloads last month - 1 maintainer
neural-compressor-3x-ort 2.5.1
Repository of Intel® Neural Compressor
2 versions - Latest release: almost 2 years ago - 1 dependent package - 32 downloads last month - 2,512 stars on GitHub - 1 maintainer
databalancer 0.2.0
Databalancer is the python library dedicated to balance the imbalanced text classification datase...
21 versions - Latest release: over 3 years ago - 1 dependent repository - 16 downloads last month - 7 stars on GitHub - 1 maintainer
sparsify 1.7.0
[DEPRECATED] Easy-to-use UI for automatically sparsifying neural networks and creating sparsifica...
25 versions - Latest release: 10 months ago - 1 dependent repository - 246 downloads last month - 326 stars on GitHub - 1 maintainer
aimet-torch 2.24.0
AIMET torch package
38 versions - Latest release: 29 days ago - 14.8 thousand downloads last month - 6 stars on GitHub - 4 maintainers
nqlib 1.0.1
NQLib: Library to design noise shaping quantizer for discrete-valued input control.
10 versions - Latest release: 3 months ago - 1 dependent repository - 41 downloads last month - 5 stars on GitHub - 1 maintainer
textembedcompress 0.3.0
A Python package to compress text embeddings using various dimensionality reduction and quantizat...
2 versions - Latest release: 9 months ago - 14 downloads last month - 1 maintainer
Top 1.9% on pypi.org
vector-quantize-pytorch 1.21.2
Vector Quantization - Pytorch
287 versions - Latest release: about 1 year ago - 26 dependent packages - 120 dependent repositories - 118 thousand downloads last month - 2,554 stars on GitHub - 1 maintainer
nunchaku-chroma 0.1.3
Chroma model support for Nunchaku - quantized inference for diffusion models
4 versions - Latest release: 3 months ago - 105 downloads last month - 1 maintainer
mct-nightly 0.0.0
A Model Compression Toolkit for neural networks
1,171 versions - Latest release: over 4 years ago - 1 dependent repository - 40.2 thousand downloads last month - 342 stars on GitHub - 1 maintainer
bitsandbytes-cuda115 0.26.0.post2
8-bit optimizers and quantization routines.
1 version - Latest release: over 3 years ago - 40 downloads last month - 136 stars on GitHub - 1 maintainer
lixinger-openapi 1.0.2
lixinger openapi
11 versions - Latest release: almost 7 years ago - 1 dependent repository - 113 downloads last month - 56 stars on GitHub - 1 maintainer
bitnet-v3 1.0.1
BitNet v3: Ultra-Low Quality Loss 1-bit LLMs Through Multi-Stage Progressive Quantization and Ada...
2 versions - Latest release: 9 months ago - 42 downloads last month - 4 stars on GitHub - 1 maintainer
tinyedgellm 0.1.0
A modular framework for LLM quantization, structured pruning, and edge deployment
1 version - Latest release: 5 months ago - 9 downloads last month - 0 stars on GitHub - 1 maintainer
llamafactory-songlab 0.9.1.dev0
Easy-to-use LLM fine-tuning framework
1 version - Latest release: over 1 year ago - 17 downloads last month - 64,347 stars on GitHub - 1 maintainer
invarlock 0.3.11
Edit-agnostic robustness evaluation reports for weight edits (InvarLock framework)
13 versions - Latest release: 26 days ago - 602 downloads last month - 0 stars on GitHub - 1 maintainer
gptq-triton 0.0.4
Fast GPTQ kernels written in Triton
2 versions - Latest release: almost 3 years ago - 27 downloads last month - 307 stars on GitHub - 1 maintainer
swiss-army-keras 0.10.0
A collection of models and utilities for the development of edge deployable Keras models
52 versions - Latest release: about 3 years ago - 1 dependent repository - 89 downloads last month - 0 stars on GitHub - 1 maintainer
angelslim 0.3.0
A toolkit for compress llm model.
6 versions - Latest release: about 2 months ago - 6.06 thousand downloads last month - 132 stars on GitHub - 1 maintainer
sparsify-nightly 0.0.1
ML model optimization product to accelerate inference.
142 versions - Latest release: 8 months ago - 1 dependent repository - 281 downloads last month - 327 stars on GitHub - 1 maintainer
Top 3.0% on pypi.org
sparseml 1.9.0
[DEPRECATED] Libraries for applying sparsification recipes to neural networks with a few lines of...
43 versions - Latest release: 9 months ago - 4 dependent packages - 20 dependent repositories - 1.06 thousand downloads last month - 2,146 stars on GitHub - 1 maintainer
topai-faster-whisper 1.0.4
Faster Whisper transcription with CTranslate2
5 versions - Latest release: over 1 year ago - 44 downloads last month - 18,751 stars on GitHub - 1 maintainer
clika-ace 0.0.2
A fake package to warn the user they are not installing the correct package.
3 versions - Latest release: about 2 years ago - 28 downloads last month - 1 maintainer
Related Keywords
pytorch: 57
llm: 52
pruning: 51
inference: 48
compression: 43
deep-learning: 42
machine-learning: 41
transformers: 40
optimization: 30
nlp: 27
onnx: 27
sparsity: 26
transformer: 25
gpu: 23
large-language-models: 20
ai: 17
python: 17
llama: 17
awq: 17
machine learning: 16
8-bit: 16
knowledge-distillation: 16
optimizers: 15
gptq: 15
deep learning: 15
model-compression: 14
LLM: 14
gguf: 14
quantization-aware-training: 14
int4: 14
tensorflow: 13
neural-network: 13
auto-tuning: 12
cuda: 12
post-training static quantization: 12
post-training dynamic quantization: 12
quantization-aware training: 12
int8: 12
post-training-quantization: 12
huggingface: 12
smoothquant: 11
low-precision: 11
fp4: 11
computer-vision: 11
sparsegpt: 10
mxformat: 10
fine-tuning: 10
torch: 9
artificial intelligence: 9
lora: 9
keras: 9
llm-inference: 9
training: 9
object-detection: 9
openai: 8
llama3: 8
cpu: 8
mistral: 8
fpga: 8
peft: 8
qwen: 8
openvino: 8
edge-ai: 8
computer vision: 7
neural network: 7
sparsification: 7
deep-neural-networks: 7
language-model: 7
intel: 7
qlora: 7
whisper: 7
network-quantization: 6
auto-around: 6
pretrained-models: 6
generative-ai: 6
agent: 6
gpt: 6
llms: 6
moe: 6
model: 6
compiler: 6
onnxruntime: 6
rounding: 5
neural networks: 5
dataflow: 5
models: 5
network-compression: 5
speech: 5
ctranslate2: 5
speech-recognition: 5
speech-to-text: 5
smaller-models: 5
deployment: 5
rlhf: 5
transfer-learning: 5
translation: 5
instruction-tuning: 5
avx2: 5
edge: 5
evaluation: 5