Start here
Use this page to pick the first document to read. Most workflows start with auto-detection and add flags only when the model, hardware, or deployment requires them.
Choose by task
| If you need to… | Start here | Then read |
|---|---|---|
| Chat with a model on one machine | Your first model | Pick a quantization method |
| Verify install, GPU support, or Hugging Face access | Your first model | Troubleshooting |
| Expose an OpenAI-compatible endpoint | Serve a model as an API | Configure the HTTP server |
| Use the built-in browser UI | Serve a model as an API | Use the built-in web UI |
| Call mistral.rs from Python in-process | Call a model from Python | Python API reference |
| Embed mistral.rs in Rust | Call a model from Rust | Rust API on docs.rs |
| Build a local agent app with tools, code execution, web search, multimodal inputs, or session state | Build an agent | Agentic runtime for apps |
| Fit a larger model on the same hardware | Quantize a model | Auto-tune with mistralrs tune |
| Split a model across GPUs or machines | Performance | Split a model across multiple GPUs |
| Run a server for real traffic | Run mistralrs in Docker | Production checklist |
Choose by runtime mode
| Mode | Use when | Entry point |
|---|---|---|
| CLI | You want local interactive use, quick tests, or benchmarking. | `mistralrs run`, `mistralrs bench`, `mistralrs tune` |
| HTTP server | You want OpenAI-compatible clients, a web UI, or a process boundary around inference. | `mistralrs serve` |
| Config file | You need repeatable multi-model startup or a deployment config checked into source control. | `mistralrs from-config -f config.toml` |
| Diagnostics | You want to check hardware detection, build features, or Hugging Face connectivity. | `mistralrs doctor` |
| Python package | You want in-process access from Python without running a server. | `mistralrs.Runner` |
| Rust crate | You want inference embedded inside a Rust service. | `mistralrs` crate |
If unsure
Start with Your first model, then Serve a model as an API. Those two pages exercise the default local and server paths and make later choices easier to evaluate.