Quantization types
ISQ (in-situ quantization) types supported by mistral.rs. For format selection guidance and underlying tradeoffs, see the quantization guide.
Flag choice for normal CLI usage:
- --quant N: normal usage.
- --isq N: force runtime ISQ and skip the UQFF (Universal Quantized File Format) lookup.
Numeric shorthands
mistral.rs resolves N to a format based on the detected backend (see table). This happens when --quant falls back to runtime ISQ, or when you pass --isq N directly.
| Shorthand | Metal resolves to | CUDA / CPU resolves to |
|---|---|---|
| 2 | AFQ2 | Q2K |
| 3 | AFQ3 | Q3K |
| 4 | AFQ4 | Q4K |
| 5 | Q5K | Q5K |
| 6 | AFQ6 | Q6K |
| 8 | AFQ8 | Q8_0 |
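
The table can be read as a plain lookup keyed by shorthand and backend. The sketch below only mirrors that table; `SHORTHANDS` and `resolve_shorthand` are hypothetical names for illustration and are not taken from the mistral.rs source.

```python
# Illustrative lookup only: it mirrors the shorthand table, not mistral.rs code.
SHORTHANDS = {
    # shorthand: (Metal, CUDA / CPU)
    2: ("AFQ2", "Q2K"),
    3: ("AFQ3", "Q3K"),
    4: ("AFQ4", "Q4K"),
    5: ("Q5K", "Q5K"),  # the table lists Q5K for both backends
    6: ("AFQ6", "Q6K"),
    8: ("AFQ8", "Q8_0"),
}


def resolve_shorthand(n, backend):
    """Return the format name that shorthand `n` maps to on `backend`."""
    if n not in SHORTHANDS:
        raise ValueError(f"unsupported shorthand: {n}")
    metal, other = SHORTHANDS[n]
    key = backend.lower()
    if key == "metal":
        return metal
    if key in ("cuda", "cpu"):
        return other
    raise ValueError(f"unknown backend: {backend}")


print(resolve_shorthand(4, "metal"))  # AFQ4
print(resolve_shorthand(8, "cuda"))   # Q8_0
```

Note that 7 is not a shorthand, and 5 maps to Q5K everywhere because the AFQ family has no 5-bit type.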
Format-specific types
AFQ family
Affine quantization, optimized for Apple Silicon. Runs on Metal (native kernels), CUDA (dedicated backend), and CPU (fallback).
| Type | Bits |
|---|---|
| afq2 | 2 |
| afq3 | 3 |
| afq4 | 4 |
| afq6 | 6 |
| afq8 | 8 |
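
For readers unfamiliar with the term, affine quantization stores each group of weights as small integer codes plus a scale and a bias, and reconstructs a weight as `code * scale + bias`. The following is a minimal, generic sketch of that idea in pure Python. It is an assumption-laden illustration: the group size, rounding, and bit packing that AFQ actually uses are not described here, and the function names are hypothetical.

```python
def affine_quantize(values, bits):
    """Quantize one group of floats to `bits`-bit codes with a scale and bias."""
    levels = (1 << bits) - 1          # highest integer code, e.g. 15 for 4 bits
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    # Map each value to the nearest code and clamp it into [0, levels].
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo           # `lo` acts as the bias


def affine_dequantize(codes, scale, bias):
    """Reconstruct approximate floats from codes, scale, and bias."""
    return [c * scale + bias for c in codes]


group = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale, bias = affine_quantize(group, bits=4)
restored = affine_dequantize(codes, scale, bias)
print(codes, restored)
```

With fewer bits the scale grows, so the worst-case reconstruction error (about half a scale step) grows with it.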
Q*K family
GGML K-quant formats. Supported on all backends.
| Type | Bits |
|---|---|
| q2k | 2 |
| q3k | 3 |
| q4k | 4 |
| q5k | 5 |
| q6k | 6 |
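
The Bits column gives a quick way to estimate weight memory. The sketch below computes a nominal size from the bit width alone. Treat the result as a lower bound: K-quant formats also store per-block scale data, so real sizes are somewhat larger. `NOMINAL_BITS` and `nominal_weight_bytes` are hypothetical names used only for this illustration.

```python
# Nominal bit widths copied from the table; per-block scale overhead is ignored.
NOMINAL_BITS = {"q2k": 2, "q3k": 3, "q4k": 4, "q5k": 5, "q6k": 6}


def nominal_weight_bytes(num_params, qtype):
    """Lower-bound size in bytes for `num_params` weights stored as `qtype`."""
    bits = NOMINAL_BITS[qtype.lower()]
    return (num_params * bits + 7) // 8  # round up to whole bytes


# A 7-billion-parameter model at a nominal 4 bits per weight:
print(nominal_weight_bytes(7_000_000_000, "q4k"))  # 3500000000
```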
Legacy GGML types
Supported for GGUF compatibility:
| Type | Bits |
|---|---|
| q4_0, q4_1 | 4 |
| q5_0, q5_1 | 5 |
| q8_0 | 8 |
E4M3 FP8. Native acceleration on NVIDIA Ada/Hopper (compute 8.9+); runs emulated elsewhere.
| Type | Bits | Layout |
|---|---|---|
| fp8 | 8 | E4M3 (4-bit exponent, 3-bit mantissa) |
| f8q8 | 8 | FP8 weights, INT8 activations |
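
To make the E4M3 layout concrete, the sketch below decodes one byte into a float. It assumes the OCP FP8 E4M3 convention (1 sign bit, exponent bias 7, no infinities, a single NaN pattern per sign). Only the 4-bit exponent and 3-bit mantissa split is stated on this page; the rest is an assumption, and `decode_e4m3` is a hypothetical helper.

```python
import math


def decode_e4m3(byte):
    """Decode one E4M3 byte (assumed OCP FP8 convention, bias 7) to a float."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte")
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0xF   # 4 exponent bits
    mantissa = byte & 0x7          # 3 mantissa bits
    if exponent == 0xF and mantissa == 0x7:
        return math.nan            # the only NaN pattern; there are no infinities
    if exponent == 0:
        # Subnormal: no implicit leading 1, fixed exponent of 2**-6.
        return sign * (mantissa / 8) * 2.0 ** -6
    return sign * (1 + mantissa / 8) * 2.0 ** (exponent - 7)


print(decode_e4m3(0x38))  # 1.0
print(decode_e4m3(0x7E))  # 448.0, the largest finite value
```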
4-bit microscaling format. Native on Blackwell; emulated elsewhere.
| Type | Bits |
|---|---|
| mxfp4 | 4 |
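
"Microscaling" means a block of low-precision elements shares a single scale. The sketch below decodes one block, assuming the OCP Microscaling (MX) convention: 4-bit E2M1 elements and a shared power-of-two E8M0 scale with bias 127 (the specification uses blocks of 32 elements). How mistral.rs packs these values in memory is not covered here, and the names are hypothetical.

```python
# Magnitudes representable by the low 3 bits of an E2M1 code (assumed OCP MX layout).
E2M1_VALUES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def decode_mxfp4_block(scale_byte, codes):
    """Decode 4-bit codes that share one E8M0 scale byte into floats."""
    if scale_byte == 0xFF:
        raise ValueError("0xFF is the NaN scale in E8M0")
    scale = 2.0 ** (scale_byte - 127)  # shared power-of-two scale
    out = []
    for code in codes:
        sign = -1.0 if code & 0x8 else 1.0  # top bit of the nibble is the sign
        out.append(sign * E2M1_VALUES[code & 0x7] * scale)
    return out


print(decode_mxfp4_block(127, [0x0, 0x1, 0x7, 0xF]))  # [0.0, 0.5, 6.0, -6.0]
```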
Half-quadratic quantization (HQQ).
| Type | Bits |
|---|---|
| hqq4 | 4 |
| hqq8 | 8 |
GPTQ and AWQ
These are pre-quantized formats, not ISQ types. Load one directly when a Hugging Face model is available as GPTQ or AWQ:
mistralrs run --format plain -m <gptq-or-awq-repo>
mistral.rs detects the quantization from the model’s config. No --quant or --isq flag is required.
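
As a rough illustration of what detection from the model's config can look like, the sketch below inspects a Hugging Face style `config.json`. It assumes the common convention of a `quantization_config` object with a `quant_method` field. The actual detection logic in mistral.rs is not shown on this page, and `detect_prequantization` is a hypothetical helper.

```python
import json


def detect_prequantization(config_text):
    """Return "gptq" or "awq" if the config declares one of them, else None."""
    config = json.loads(config_text)
    # Assumed Hugging Face convention: quantization_config.quant_method
    quant = config.get("quantization_config") or {}
    method = quant.get("quant_method")
    if method is None:
        return None
    method = str(method).lower()
    return method if method in ("gptq", "awq") else None


example = '{"model_type": "llama", "quantization_config": {"quant_method": "gptq", "bits": 4}}'
print(detect_prequantization(example))  # gptq
```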
See the quantization guide for format selection.