UQFF internal structure

The following describes the exact memory layout of UQFF tensors of version 0.1.0.

GGUF quantization

| ID | Element type | Endianness |
| --- | --- | --- |
| UQFF version | u32 | little endian |
| ISQ type (0) | u8 | little endian |
| Tensor data length in bytes | u32 | little endian |
| Whether bias data is included (boolean) | u8 | little endian |
| Quantized dtype | u32 | little endian |
| Num shape dims | u32 | little endian |
| Array quantized weight shape dims | u32 | little endian |
| Array quantized weight data | u8 | little endian |
| [Optional] Array Bias tensor data, see docs | See docs | See docs |

Unquantized layers

| ID | Element type | Endianness |
| --- | --- | --- |
| UQFF version | u32 | little endian |
| ISQ type (1) | u8 | little endian |
| Whether bias data is included (boolean) | u8 | little endian |
| Array Weight tensor data, see docs | See docs | See docs |
| [Optional] Array Bias tensor data, see docs | See docs | See docs |
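A minimal Python sketch of reading an unquantized layer follows. It assumes that the weight and bias entries marked "see docs" use the layout given under "Standard tensors" on this page (data length, dtype, rank, dims, raw bytes), and that all fields are packed without padding. The helper names `_read_tensor` and `read_unquantized_layer` are hypothetical.

```python
import struct

def _read_tensor(buf, offset):
    """Read one tensor (assumed standard layout); return (tensor, next_offset)."""
    data_len, dtype_code, ndims = struct.unpack_from("<III", buf, offset)
    offset += 12
    shape = struct.unpack_from(f"<{ndims}I", buf, offset)
    offset += 4 * ndims
    data = bytes(buf[offset:offset + data_len])
    if len(data) != data_len:
        raise ValueError("buffer is shorter than the declared data length")
    return (dtype_code, shape, data), offset + data_len

def read_unquantized_layer(buf):
    """Parse version, ISQ type, bias flag, weight, and the optional bias."""
    version, isq_type, has_bias = struct.unpack_from("<IBB", buf, 0)
    if isq_type != 1:
        raise ValueError(f"expected ISQ type 1 (unquantized), got {isq_type}")
    weight, offset = _read_tensor(buf, 6)  # 4 + 1 + 1 prefix bytes
    bias = None
    if has_bias:
        bias, offset = _read_tensor(buf, offset)
    return {"version": version, "weight": weight, "bias": bias}
```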

FP8 layers

| ID | Element type | Endianness |
| --- | --- | --- |
| UQFF version | u32 | little endian |
| ISQ type (1) | u8 | little endian |
| Whether bias data is included (boolean) | u8 | little endian |
| Array Weight tensor data, see docs | See docs | See docs |
| Dequant W scalar | f32 | little endian |
| Dequant X scalar | f32 | little endian |
| Quant scalar | f32 | little endian |
| Quantization type | u32 | little endian |
| [Optional] Array Bias tensor data, see docs | See docs | See docs |

HQQ quantization

| ID | Element type | Endianness |
| --- | --- | --- |
| UQFF version | u32 | little endian |
| ISQ type (2) | u8 | little endian |
| Whether bias data is included (boolean) | u8 | little endian |
| Array Q weight, see docs | See docs | See docs |
| Array Q scale, see docs | See docs | See docs |
| Array Q zeroes, see docs | See docs | See docs |
| Dequant weight num shape dims | u32 | little endian |
| Array dequant weight shape dims | u32 | little endian |
| CFG bits | u8 | little endian |
| CFG group size | u32 | little endian |
| CFG axis | u8 | little endian |
| CFG optimization steps (0 means Option::None for now) | u32 | little endian |
| CFG round zeroes (boolean) | u8 | little endian |
| CFG channel wise (boolean) | u8 | little endian |

FP8 layers

| ID | Element type | Endianness |
| --- | --- | --- |
| UQFF version | u32 | little endian |
| ISQ type (3) | u8 | little endian |
| Whether bias data is included (boolean) | u8 | little endian |
| Array Weight tensor data, see docs | See docs | See docs |
| Dequant scale W | f32 | little endian |
| Dequant scale X | f32 | little endian |
| Quant scale | f32 | little endian |
| Layer dtype | u32 | little endian |
| [Optional] Array Bias tensor data, see docs | See docs | See docs |

Standard tensors

| ID | Element type | Endianness |
| --- | --- | --- |
| Tensor data length in bytes | u32 | little endian |
| Tensor dtype | u32 | little endian |
| Num shape dims | u32 | little endian |
| Array shape dims | u32 | little endian |
| Array flattened (contiguous) tensor data | u8 | little endian |
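To make the standard tensor layout concrete, here is a small Python round-trip sketch. It assumes the fields are written back to back with no padding; `pack_tensor` and `unpack_tensor` are hypothetical names, and the dtype is passed through as a plain integer code because the code-to-dtype mapping is not specified on this page.

```python
import struct

def pack_tensor(dtype_code, shape, data):
    """Serialize one tensor: byte length, dtype code, rank, dims, raw bytes."""
    header = struct.pack("<III", len(data), dtype_code, len(shape))
    dims = struct.pack(f"<{len(shape)}I", *shape)
    return header + dims + bytes(data)

def unpack_tensor(buf, offset=0):
    """Parse one tensor at `offset`; return (dtype_code, shape, data, next_offset)."""
    data_len, dtype_code, ndims = struct.unpack_from("<III", buf, offset)
    offset += 12  # three u32 header fields
    shape = struct.unpack_from(f"<{ndims}I", buf, offset)
    offset += 4 * ndims
    data = bytes(buf[offset:offset + data_len])
    if len(data) != data_len:
        raise ValueError("buffer ends before the declared tensor data length")
    return dtype_code, shape, data, offset + data_len
```

Because `unpack_tensor` returns the offset just past the tensor, several tensors stored one after another can be read by feeding each returned offset into the next call.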