CLI reference

| Subcommand | Purpose |
| --- | --- |
| `mistralrs serve` | Start HTTP/MCP server and (optionally) the UI at `/ui` |
| `mistralrs run` | Run model in interactive mode, or one-shot mode with `-i` |
| `mistralrs completions` | Generate shell completions |
| `mistralrs quantize` | Generate UQFF quantized model file |
| `mistralrs uqff` | Inspect, report, or verify UQFF artifacts |
| `mistralrs doctor` | Run system diagnostics and environment checks |
| `mistralrs tune` | Recommend quantization + device mapping for a model. Rejects `--quant auto`; pass `--quant <level>` or `--isq <level>` to bias the recommendation toward a specific quantization target |
| `mistralrs login` | Authenticate with HuggingFace Hub |
| `mistralrs cache` | Manage the HuggingFace model cache |
| `mistralrs bench` | Run performance benchmarks for plain model generation |
| `mistralrs from-config` | Run from a full TOML configuration file |
| `mistralrs update` | Update or migrate an install using the installer |
| `mistralrs uninstall` | Remove an installer-managed install |

| Option | Default | Description |
| --- | --- | --- |
| `--seed <SEED>` | | Random seed for reproducibility |
| `-l, --log <LOG>` | | Log all requests and responses to this file |
| `--token-source <TOKEN_SOURCE>` | `cache` | Token source for HuggingFace authentication. Formats: `literal:<token>`, `env:<var>`, `path:<file>`, `cache`, `none` |
| `-v, --verbose` | `0` | Increase logging verbosity. Use `-v` for debug and `-vv` for trace-level internals |
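
The `--token-source` formats listed above can be illustrated with a small parser. This is a sketch in Python, not how mistralrs itself handles the option: the function names `parse_token_source` and `resolve_token` are hypothetical, and the `cache` case is deliberately left unresolved because this page does not say where the cached token is stored.

```python
import os
from pathlib import Path
from typing import Optional, Tuple


def parse_token_source(spec: str) -> Tuple[str, Optional[str]]:
    """Split a token-source string into (kind, argument).

    Accepts the five formats from the options table:
    literal:<token>, env:<var>, path:<file>, cache, none.
    """
    if spec in ("cache", "none"):
        return spec, None
    # partition splits on the first ":" only, so a literal token
    # that itself contains ":" is kept intact.
    kind, sep, arg = spec.partition(":")
    if sep and kind in ("literal", "env", "path") and arg:
        return kind, arg
    raise ValueError(f"unrecognized token source: {spec!r}")


def resolve_token(spec: str) -> Optional[str]:
    """Resolve a token for the formats that need no extra knowledge.

    Assumption: "cache" is not resolved here, since the cache
    location is not given in this reference; it returns None,
    the same as "none".
    """
    kind, arg = parse_token_source(spec)
    if kind == "literal":
        return arg
    if kind == "env":
        return os.environ.get(arg)
    if kind == "path":
        return Path(arg).read_text().strip()
    return None
```

For example, `parse_token_source("env:HF_TOKEN")` yields the pair `("env", "HF_TOKEN")`, where `HF_TOKEN` is only an illustrative variable name.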