Embeddings Overview

Mistral.rs can load embedding models alongside chat, vision, diffusion, and speech workloads. Embedding models produce dense vector representations that you can use for similarity search, clustering, reranking, and other semantic tasks.
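Once you have embedding vectors, most of these tasks reduce to comparing them, typically with cosine similarity. The snippet below is a minimal sketch of that comparison; the vectors shown are placeholders rather than real model output, and the function name is illustrative.

```python
import numpy as np

def cosine_similarity(a, b) -> float:
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Vectors pointing in similar directions score close to 1.0.
doc_vec = [0.12, -0.03, 0.88]    # placeholder embedding of a document
query_vec = [0.10, -0.01, 0.90]  # placeholder embedding of a query
print(cosine_similarity(doc_vec, query_vec))
```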

Supported models

| Model | Notes | Documentation |
|---|---|---|
| EmbeddingGemma | Google’s multilingual embedding model. | EMBEDDINGGEMMA.md |
| Qwen3 Embedding | Qwen’s general-purpose embedding encoder. | QWEN3_EMBEDDING.md |

Have another embedding model you would like supported? Open an issue with the model ID and configuration.

Usage overview

  1. Choose a model from the table above.
  2. Load it through one of our APIs:
    • CLI/HTTP
    • Python
    • Rust

Detailed examples for each model live in their dedicated documentation pages.
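As a quick orientation, here is a minimal sketch of requesting embeddings over the HTTP route using the OpenAI-compatible Python client. It assumes a mistral.rs server is already running locally with an embedding model loaded and that it exposes an OpenAI-style `/v1/embeddings` endpoint; the base URL, port, and model name are placeholders, so substitute the values from your own setup.

```python
from openai import OpenAI

# Point the OpenAI client at a locally running mistral.rs server.
# The base URL and port are assumptions; use whatever you started the server with.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="EMPTY")

response = client.embeddings.create(
    model="embedding-model",  # placeholder: use the model ID you loaded
    input=["What is the capital of France?", "Paris is the capital of France."],
)

# Each returned item carries one dense vector.
vectors = [item.embedding for item in response.data]
print(len(vectors), "embeddings of dimension", len(vectors[0]))
```

The same request works with plain HTTP tooling; the Python client is used here only because it keeps the example short.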