Reference
Reference pages are short, complete, and lookup-oriented. For narrative or motivation, see the Guides or Explanation sections.
Contents
CLI. Every subcommand and flag of the mistralrs binary: run, serve, bench, tune, login, from-config, and the rest.
TOML configuration. The schema for the config file mistralrs from-config reads.
HTTP API. Endpoint-by-endpoint server documentation, with OpenAI-compatible, Anthropic-compatible, and mistral.rs-specific request and response schemas.
OpenAI compatibility. Which parts of OpenAI’s Chat Completions and Responses surface are implemented, and which are not. For setup, see OpenAI-compatible APIs.
Anthropic Messages API. The Anthropic-compatible Messages surface, including streaming, tool use, token counting, and server-side agent tools.
Python API. The public surface of the mistralrs Python package, generated from the type stub: Runner, Which, request and response types.
Rust API. Canonical reference at docs.rs/mistralrs.
MCP configuration schema. The JSON schema for MCP client configuration files.
Supported models. Every supported architecture, the modalities each accepts, and the compatible quantization methods.
Model notes. A short FAQ for models with non-standard behavior.
Cargo features. Build-from-source feature flags.
Environment variables. Every env var the binary or its build scripts read.
UQFF format. The on-disk binary layout of the UQFF quantization format.
Quantization types. Bit counts, hardware requirements, and relative quality per supported method.
Troubleshooting. Symptom-to-cause index for common errors.