pub fn fp8_blockwise_dequantize(
weight: &Tensor,
inv_scales: &Tensor,
weight_block_size: Vec<usize>,
out_ty: DType,
) -> Result<Tensor>
Dequantize an FP8 blockwise-quantized weight tensor.
- Expects `weight` to have an fp8 dtype
- Expects `inv_scales` to be f32, with one scale per block of shape `weight_block_size`
- Each element is dequantized as `weight * inv_scale` using the inverse scale of its block, then cast to `out_ty`
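The per-block semantics can be sketched in plain Rust (an illustrative model only, not the actual tensor kernel; `dequantize_blockwise` and its plain-`Vec` types are hypothetical stand-ins for the `Tensor` operations, with the fp8 weights already widened to f32):

```rust
// Illustrative sketch: each element of `weight` is multiplied by the
// inverse scale of the block it falls in, given block size (b0, b1).
fn dequantize_blockwise(
    weight: &[Vec<f32>],     // stands in for the fp8 weight, widened to f32
    inv_scales: &[Vec<f32>], // one f32 inverse scale per block
    block: (usize, usize),   // weight_block_size, e.g. (128, 128)
) -> Vec<Vec<f32>> {
    weight
        .iter()
        .enumerate()
        .map(|(i, row)| {
            row.iter()
                .enumerate()
                // Integer division maps element (i, j) to its block's scale.
                .map(|(j, &w)| w * inv_scales[i / block.0][j / block.1])
                .collect()
        })
        .collect()
}

fn main() {
    // A 4x4 weight with 2x2 blocks needs a 2x2 grid of inverse scales.
    let weight = vec![vec![1.0_f32; 4]; 4];
    let inv_scales = vec![vec![0.5, 2.0], vec![4.0, 0.25]];
    let out = dequantize_blockwise(&weight, &inv_scales, (2, 2));
    assert_eq!(out[0][0], 0.5); // top-left block uses inv_scales[0][0]
    assert_eq!(out[3][3], 0.25); // bottom-right block uses inv_scales[1][1]
    println!("ok");
}
```

The real function performs the same multiply on-device and casts the result to `out_ty`; the sketch only shows which scale applies to which element.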