Embeddings Overview
Mistral.rs can load embedding models alongside chat, vision, diffusion, and speech workloads. Embedding models produce dense vector representations that you can use for similarity search, clustering, reranking, and other semantic tasks.
Supported models
| Model | Notes | Documentation |
|---|---|---|
| EmbeddingGemma | Google’s multilingual embedding model. | EMBEDDINGGEMMA.md |
| Qwen3 Embedding | Qwen’s general-purpose embedding encoder. | QWEN3_EMBEDDING.md |
Have another embedding model you would like supported? Open an issue with the model ID and configuration.
Usage overview
- Choose a model from the table above.
- Load it through one of our APIs:
- CLI/HTTP
- Python
- Rust
Detailed examples for each model live in their dedicated documentation pages.