Python API
The mistralrs Python package exposes the same engine that powers the mistralrs CLI.
Install
Section titled “Install”One wheel per accelerator. All wheels expose the same mistralrs module.
| Accelerator | Package |
|---|---|
| CPU (or Intel CPU with MKL) | pip install mistralrs |
| NVIDIA GPU | pip install mistralrs-cuda |
| Apple Silicon | pip install mistralrs-metal |
| Intel MKL (pinned) | pip install mistralrs-mkl |
| macOS Accelerate | pip install mistralrs-accelerate |
| Page | Covers |
|---|---|
| Runner | The main entry point. Load a model and send requests. |
| Which | Variants that select which kind of model to load. |
| Requests | Request dataclasses passed to Runner methods. |
| Responses | Response and streaming types returned by the engine. |
| Enums | Architecture, dtype, and option enums. |
| Search | Types for web-search tool configuration. |
| AnyMoE | AnyMoE expert and config types. |
| Code execution | Configuration for the built-in Python code executor. |
| Agent approvals | Request and decision types for agent action approval callbacks. |
| Files | First-class output files surfaced from agentic runs. |
| MCP | MCP client configuration types. |
| Auto-mapping | Hints for automatic device mapping. |
See Tutorial 3 for a walkthrough and the Python guides for task-oriented recipes.
Generated from mistralrs-pyo3/mistralrs.pyi.