CLI reference

Subcommands

Subcommand	Purpose
`mistralrs serve`	Start the HTTP/MCP server and, optionally, the UI at `/ui`
`mistralrs run`	Run model in interactive mode, or one-shot mode with `-i`
`mistralrs completions`	Generate shell completions
`mistralrs quantize`	Generate UQFF quantized model file
`mistralrs uqff`	Inspect, report, or verify UQFF artifacts
`mistralrs doctor`	Run system diagnostics and environment checks
`mistralrs tune`	Recommend quantization and device mapping for a model. `--quant auto` is rejected; pass `--quant <level>` or `--isq <level>` to bias the recommendation toward a specific quantization target
`mistralrs login`	Authenticate with HuggingFace Hub
`mistralrs cache`	Manage the HuggingFace model cache
`mistralrs bench`	Run performance benchmarks for plain model generation
`mistralrs from-config`	Run from a full TOML configuration file
`mistralrs update`	Update or migrate an install using the installer
`mistralrs uninstall`	Remove an installer-managed install

Global options

Option	Default	Description
`--seed <SEED>`		Random seed for reproducibility
`-l, --log <LOG>`		Log all requests and responses to this file
`--token-source <TOKEN_SOURCE>`	`cache`	Token source for HuggingFace authentication. Formats: `literal:<token>`, `env:<var>`, `path:<file>`, `cache`, `none`
`-v, --verbose`	`0`	Increase logging verbosity. Use `-v` for debug and `-vv` for trace-level internals
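
The `--token-source` formats listed above can be told apart by their prefix. The following is a minimal sketch of that classification, not the parser `mistralrs` itself uses: `token_source_kind` is a hypothetical helper, and rejecting an empty value after the colon (for example `env:`) is an assumption rather than documented behavior.

```shell
#!/bin/sh
# Hypothetical helper (not part of mistralrs): classify a --token-source
# value according to the formats listed in the options table.
# Prints the kind on stdout; returns non-zero for an unrecognized value.
token_source_kind() {
    case "$1" in
        # Bare keywords carry no payload.
        cache|none)  printf '%s\n' "$1" ;;
        # Prefixed forms; "?*" requires at least one character after the
        # colon (assumption: an empty payload is treated as invalid here).
        literal:?*)  printf 'literal\n' ;;
        env:?*)      printf 'env\n' ;;
        path:?*)     printf 'path\n' ;;
        *)           printf 'invalid\n'; return 1 ;;
    esac
}
```

For instance, `token_source_kind "env:HF_TOKEN"` prints `env`, and `token_source_kind "cache"` prints `cache`. A wrapper script could use a check like this to catch a mistyped value before it is passed to `--token-source`.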