Guides

Guides answer “how do I…” questions. They assume mistral.rs is already installed; if it is not, start with the Tutorials.

| If you need to… | Start here |
| --- | --- |
| Install for a specific platform or deployment target | Install and deploy |
| Run an HTTP server or web UI | Serve models |
| Reduce memory use or improve throughput | Performance |
| Add tools, search, code execution, or MCP | Build agents |
| Use the Python package | Python SDK |
| Use the Rust crate | Rust SDK |
| Work with vision, speech, image generation, or embeddings | Model types |
| Change model behavior or load adapters | Customize |

  • Install and deploy: platform-specific install steps, Docker images, and pre-production checks.
  • Serve models: HTTP server configuration, multi-model serving, the web UI, and the OpenAI Responses API surface.
  • Performance: quantization selection, the tune command, FlashAttention and PagedAttention, and multi-GPU or multi-machine splits.
  • Build agents: tool calling, code execution, web search, MCP, and persistent sessions.
  • Python SDK: streaming completions, image and video input, and the multi-turn session API.
  • Rust SDK: streaming, and embedding mistral.rs inside an Axum application.
  • Model types: vision input, image generation, speech, and embedding models.
  • Customize: LoRA adapters, AnyMoE, MatFormer, sampling parameters, and TOML config.