Crate mistralrs_vision

This crate provides vision utilities for mistral.rs inspired by torchvision. In particular, it represents transformations on some Self type which are applied sequentially.

§Example

use candle_core::Device;
use image::{ColorType, DynamicImage};
use mistralrs_vision::{ApplyTransforms, Normalize, ToTensor, Transforms};

let image = DynamicImage::new(3, 4, ColorType::Rgb8);
let transforms = Transforms {
    input: &ToTensor,
    inner_transforms: &[&Normalize {
        mean: vec![0.5, 0.5, 0.5],
        std: vec![0.5, 0.5, 0.5],
    }],
};
let transformed = image.apply(transforms, &Device::Cpu).unwrap();
assert_eq!(transformed.dims(), &[3, 4, 3]);

Structs§

  • Resize the image via nearest interpolation.
  • Normalize the image data based on the mean and standard deviation. The value is computed as follows: x[channel] = (x[channel] - mean[channel]) / std[channel]
  • Multiply the pixel values by the provided factor.
  • Transforms, with each of inner_transforms applied sequentially
  • Convert an image to a tensor. This converts the data from being in [0, 255] to [0.0, 1.0]. The tensor’s shape is (channels, height, width).
  • Convert an image to a tensor without normalizing to [0.0, 1.0]. The tensor’s shape is (channels, height, width).
  • Transforms to apply, starting with the input and then with each transform in inner_transforms applied sequentially
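
The scaling and normalization described above can be sketched in plain Rust without any tensor library. This is only an illustration of the arithmetic: the function names `to_unit_range` and `normalize` are hypothetical and are not part of this crate's API, and the data is assumed to be stored channel-major, matching the (channels, height, width) layout.

```rust
/// Hypothetical helper: scale 8-bit pixel values from [0, 255] to [0.0, 1.0].
fn to_unit_range(pixels: &[u8]) -> Vec<f32> {
    pixels.iter().map(|&p| f32::from(p) / 255.0).collect()
}

/// Hypothetical helper: apply x[c] = (x[c] - mean[c]) / std[c] per channel.
/// `data` is assumed to hold `mean.len()` channels of equal length,
/// stored one after another (channel-major).
fn normalize(data: &[f32], mean: &[f32], std: &[f32]) -> Vec<f32> {
    assert_eq!(mean.len(), std.len(), "mean and std need one entry per channel");
    assert!(!mean.is_empty() && data.len() % mean.len() == 0);
    let per_channel = data.len() / mean.len();
    data.iter()
        .enumerate()
        .map(|(i, &x)| {
            // Index of the channel that element `i` belongs to.
            let c = i / per_channel;
            (x - mean[c]) / std[c]
        })
        .collect()
}
```

With a mean and standard deviation of 0.5 for every channel, as in the example at the top of this page, values in [0.0, 1.0] are mapped to [-1.0, 1.0].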

Traits§

Functions§

  • Given the image sizes (h, w) and the minimum and maximum lengths, calculate the image dimensions which will preserve aspect ratio while respecting the minimum and maximum lengths.
  • Generate pixel mask of shape (c, max_h, max_w). 1 indicates valid pixel, 0 indicates padding.
  • Pad an image of shape (c, h, w) to (c, max_h, max_w) by padding with zeros on the right and bottom.
  • Resize the images to the maximum edge length, preserving aspect ratio. Pad all the images with black padding.
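
The padding and pixel-mask behavior described above can be sketched on flat, channel-major buffers. This is a minimal illustration only: `pad_to` and `pixel_mask` are hypothetical names, and the crate's own functions operate on tensors with signatures not shown on this page.

```rust
/// Hypothetical helper: pad a channel-major (c, h, w) image to
/// (c, max_h, max_w) with zeros on the right and bottom.
fn pad_to(image: &[f32], c: usize, h: usize, w: usize, max_h: usize, max_w: usize) -> Vec<f32> {
    assert_eq!(image.len(), c * h * w);
    assert!(h <= max_h && w <= max_w);
    let mut out = vec![0.0; c * max_h * max_w];
    for ch in 0..c {
        for row in 0..h {
            // Copy one source row into the left part of the wider target row.
            let src = (ch * h + row) * w;
            let dst = (ch * max_h + row) * max_w;
            out[dst..dst + w].copy_from_slice(&image[src..src + w]);
        }
    }
    out
}

/// Hypothetical helper: build a (c, max_h, max_w) mask where 1 marks a
/// valid pixel of the original (h, w) image and 0 marks padding.
fn pixel_mask(c: usize, h: usize, w: usize, max_h: usize, max_w: usize) -> Vec<u8> {
    assert!(h <= max_h && w <= max_w);
    let mut mask = vec![0u8; c * max_h * max_w];
    for ch in 0..c {
        for row in 0..h {
            let start = (ch * max_h + row) * max_w;
            mask[start..start + w].fill(1);
        }
    }
    mask
}
```

Because the padding is added only on the right and bottom, the original pixels keep their row and column indices, so the mask and the padded image line up element by element.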