Python API

The mistralrs Python package exposes the same engine that powers the mistralrs CLI.

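There is one wheel per accelerator; install the one that matches your hardware. Every wheel exposes the same mistralrs module, so import statements do not change between them.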

| Accelerator | Package |
| --- | --- |
| CPU (or Intel CPU with MKL) | `pip install mistralrs` |
| NVIDIA GPU | `pip install mistralrs-cuda` |
| Apple Silicon | `pip install mistralrs-metal` |
| Intel MKL (pinned) | `pip install mistralrs-mkl` |
| macOS Accelerate | `pip install mistralrs-accelerate` |
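
Because every wheel installs the same mistralrs module, the import name alone does not reveal which accelerator build is present. The sketch below is one way to check, using only the standard library. It assumes that the installed distribution names match the pip package names listed above; `installed_wheels` is a hypothetical helper, not part of the mistralrs API.

```python
from importlib import metadata

# Distribution names taken from the package table above (assumed to match
# the names recorded in the installed package metadata).
WHEELS = [
    "mistralrs",
    "mistralrs-cuda",
    "mistralrs-metal",
    "mistralrs-mkl",
    "mistralrs-accelerate",
]


def installed_wheels(candidates=WHEELS):
    """Return {distribution name: version} for each candidate that is installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This wheel is not installed in the current environment.
            continue
    return found


if __name__ == "__main__":
    wheels = installed_wheels()
    if not wheels:
        print("No mistralrs wheel found in this environment.")
    for name, version in wheels.items():
        print(f"{name} {version}")
```

If more than one entry is reported, two wheels are installed side by side. Since they all provide the same module, it is probably safest to keep only one.
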

| Page | Covers |
| --- | --- |
| Runner | The main entry point. Load a model and send requests. |
| Which | Variants that select which kind of model to load. |
| Requests | Request dataclasses passed to Runner methods. |
| Responses | Response and streaming types returned by the engine. |
| Enums | Architecture, dtype, and option enums. |
| Search | Types for web-search tool configuration. |
| AnyMoE | AnyMoE expert and config types. |
| Code execution | Configuration for the built-in Python code executor. |
| Agent approvals | Request and decision types for agent action approval callbacks. |
| Files | First-class output files surfaced from agentic runs. |
| MCP | MCP client configuration types. |
| Auto-mapping | Hints for automatic device mapping. |

See Tutorial 3 for a walkthrough and the Python guides for task-oriented recipes.


Generated from mistralrs-pyo3/mistralrs.pyi.