diffusion_rs_backend

Trait QuantMethod

pub trait QuantMethod:
    Send
    + Sync
    + Debug {
    // Required methods
    fn new(method: QuantMethodConfig) -> Result<Self>
       where Self: Sized;
    fn dequantize_w(&self, out_ty: DType) -> Result<Tensor>;
    fn forward(&self, a: &Tensor) -> Result<Tensor>;
    fn quantized_act_type(&self) -> Option<DType>;
    fn to_device(&self, dev: &Device) -> Result<Arc<dyn QuantMethod>>;
    fn device(&self) -> Device;
    fn size_in_bytes(&self) -> Result<usize>;

    // Provided methods
    fn forward_autocast(&self, a: &Tensor) -> Result<Tensor> { ... }
    fn forward_via_half(&self, a: &Tensor) -> Result<Tensor> { ... }
}

A quantization method used to perform a quantized matmul.
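A minimal, self-contained sketch of the pattern this trait captures. All types here are toy stand-ins (a 1-D Vec<f32> "tensor", a two-variant DType, naive symmetric 8-bit quantization), not the real candle/diffusion_rs types; it only illustrates the shape of dequantize_w, forward, and quantized_act_type.

```rust
// Toy sketch of the QuantMethod pattern. `Tensor`, `DType`, and the
// quantization scheme are stand-ins, not the real backend types.

#[derive(Debug, Clone, Copy, PartialEq)]
enum DType { F32, I8 }

// Toy 1-D "tensor": just a vector of f32.
type Tensor = Vec<f32>;

trait QuantMethod: std::fmt::Debug {
    /// Dequantize the stored weights back to `out_ty` (only F32 here).
    fn dequantize_w(&self, out_ty: DType) -> Result<Tensor, String>;
    /// "Matmul" (here: dot product) of the stored weights with `a`.
    fn forward(&self, a: &Tensor) -> Result<f32, String>;
    /// The activation dtype this method expects, if quantized.
    fn quantized_act_type(&self) -> Option<DType>;
}

// Symmetric 8-bit quantized weights: w ≈ q * scale.
#[derive(Debug)]
struct QLinear { q: Vec<i8>, scale: f32 }

impl QLinear {
    fn quantize(w: &[f32]) -> Self {
        let max = w.iter().fold(0f32, |m, x| m.max(x.abs()));
        let scale = if max == 0.0 { 1.0 } else { max / 127.0 };
        let q = w.iter().map(|x| (x / scale).round() as i8).collect();
        QLinear { q, scale }
    }
}

impl QuantMethod for QLinear {
    fn dequantize_w(&self, out_ty: DType) -> Result<Tensor, String> {
        match out_ty {
            DType::F32 => Ok(self.q.iter().map(|&q| q as f32 * self.scale).collect()),
            other => Err(format!("unsupported out_ty {other:?}")),
        }
    }
    fn forward(&self, a: &Tensor) -> Result<f32, String> {
        if a.len() != self.q.len() { return Err("shape mismatch".into()); }
        // Dequantize on the fly and accumulate in f32.
        Ok(a.iter().zip(&self.q).map(|(x, &q)| x * q as f32 * self.scale).sum())
    }
    fn quantized_act_type(&self) -> Option<DType> { Some(DType::F32) }
}

fn main() {
    let layer = QLinear::quantize(&[1.0, -2.0, 0.5]);
    let out = layer.forward(&vec![2.0, 1.0, 4.0]).unwrap();
    // Exact dot product is 2.0*1.0 + 1.0*(-2.0) + 4.0*0.5 = 2.0;
    // the 8-bit quantization introduces only a small error.
    assert!((out - 2.0).abs() < 0.05);
    println!("{out}");
}
```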

Required Methods§

fn new(method: QuantMethodConfig) -> Result<Self>
where Self: Sized,

fn dequantize_w(&self, out_ty: DType) -> Result<Tensor>

fn forward(&self, a: &Tensor) -> Result<Tensor>

Compute matmul of self and a. self should contain the weights.

fn quantized_act_type(&self) -> Option<DType>

If this is a quantized method, return the activation dtype.

fn to_device(&self, dev: &Device) -> Result<Arc<dyn QuantMethod>>

Cast this layer to the given device.

fn device(&self) -> Device

fn size_in_bytes(&self) -> Result<usize>

Provided Methods§

fn forward_autocast(&self, a: &Tensor) -> Result<Tensor>

Compute matmul of self and a. self should contain the weights. Automatically casts the input to the required quantization activation type and the result back.

fn forward_via_half(&self, a: &Tensor) -> Result<Tensor>

Compute matmul of self and a. self should contain the weights. This may go via half precision if it is supported.
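The two provided methods wrap forward with dtype handling. The control flow of the autocast pattern can be sketched as follows; the types are toy stand-ins for the real candle tensors (to_dtype here only relabels the dtype, which is enough to show the flow), so this is an illustration, not the crate's actual default implementation.

```rust
// Sketch of the forward_autocast pattern: if the method declares a
// quantized activation dtype, cast the input to it, run forward, then
// cast the result back to the input's original dtype.

#[derive(Debug, Clone, Copy, PartialEq)]
enum DType { F32, F16 }

#[derive(Debug, Clone)]
struct Tensor { data: Vec<f32>, dtype: DType }

impl Tensor {
    // Toy cast: real code would convert the storage; relabeling is
    // enough to demonstrate the control flow.
    fn to_dtype(&self, dtype: DType) -> Tensor {
        Tensor { data: self.data.clone(), dtype }
    }
}

trait QuantMethod {
    fn forward(&self, a: &Tensor) -> Result<Tensor, String>;
    fn quantized_act_type(&self) -> Option<DType>;

    // Provided method: cast `a` to the required activation dtype (if
    // any), run forward, and cast the result back.
    fn forward_autocast(&self, a: &Tensor) -> Result<Tensor, String> {
        match self.quantized_act_type() {
            Some(t) if t != a.dtype => {
                let out = self.forward(&a.to_dtype(t))?;
                Ok(out.to_dtype(a.dtype))
            }
            _ => self.forward(a),
        }
    }
}

// A method that only accepts F16 activations.
#[derive(Debug)]
struct HalfLinear { w: Vec<f32> }

impl QuantMethod for HalfLinear {
    fn forward(&self, a: &Tensor) -> Result<Tensor, String> {
        if a.dtype != DType::F16 { return Err("expected F16 activations".into()); }
        let y: f32 = a.data.iter().zip(&self.w).map(|(x, w)| x * w).sum();
        Ok(Tensor { data: vec![y], dtype: DType::F16 })
    }
    fn quantized_act_type(&self) -> Option<DType> { Some(DType::F16) }
}

fn main() {
    let layer = HalfLinear { w: vec![1.0, 2.0, 3.0] };
    let a = Tensor { data: vec![1.0, 1.0, 1.0], dtype: DType::F32 };
    // forward() alone rejects F32 input; forward_autocast handles the casts.
    let out = layer.forward_autocast(&a).unwrap();
    assert_eq!(out.dtype, DType::F32);
    assert_eq!(out.data, vec![6.0]);
}
```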

Trait Implementations§

impl Module for dyn QuantMethod

fn forward(&self, xs: &Tensor) -> Result<Tensor>

Implementors§