Troubleshooting

Before debugging setup issues, run mistralrs doctor. It reports detected hardware, compiled accelerator features, and Hugging Face connectivity.

For unlisted issues, file an issue on GitHub with a reproducer.

mistralrs: command not found after install

The binary is at ~/.cargo/bin/mistralrs. The directory is added to PATH by rustup, but the change does not apply to the current shell. Open a new shell or run source "$HOME/.cargo/env".

Build fails with flash-attn feature enabled

Flash attention requires compute capability 8.0+. On older GPUs, drop flash-attn from features and rebuild with cuda cudnn.

The token must start with hf_. The validation happens in mistralrs login before saving.

Gated repository (Gemma, LLaMA, FLUX.1-dev, etc.)

Accept the license on the model’s Hugging Face page, then save a token with mistralrs login. The token is stored at ~/.cache/huggingface/token (or $HF_HOME/token).
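To check which token file would be picked up, the lookup described above can be sketched in Python. This is a minimal sketch using only the standard library. The helper names `token_path` and `looks_like_hf_token` are hypothetical and not part of mistralrs, and the prefix check does not verify the token with Hugging Face.

```python
import os
from pathlib import Path


def token_path(env=None):
    """Return the expected token file: $HF_HOME/token if HF_HOME is set,
    otherwise ~/.cache/huggingface/token."""
    env = os.environ if env is None else env
    hf_home = env.get("HF_HOME")
    if hf_home:
        return Path(hf_home) / "token"
    return Path.home() / ".cache" / "huggingface" / "token"


def looks_like_hf_token(token):
    """Check the hf_ prefix only; this does not contact Hugging Face."""
    return token.strip().startswith("hf_")


if __name__ == "__main__":
    path = token_path()
    if path.is_file():
        ok = looks_like_hf_token(path.read_text())
        print(f"{path}: {'prefix looks valid' if ok else 'does not start with hf_'}")
    else:
        print(f"{path}: no token file found")
```

If the script reports no token file, run mistralrs login again after accepting the license.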

Add --quant 4. If the model is still too large, try --quant 2 or split it across GPUs with -n "0:N1;1:N2;...".

Verify accelerator features are compiled in with mistralrs doctor. If cuda is missing, the binary was built without GPU support.

max_tokens is most likely too low. Check finish_reason: length means the token limit was reached; stop means a stop sequence matched.
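The finish_reason check can be scripted when inspecting many responses. The sketch below assumes the response body has an OpenAI-style shape with a `choices` list whose entries carry `finish_reason`. That shape and the helper name `explain_finish_reason` are assumptions for illustration.

```python
def explain_finish_reason(response):
    """Map each choice's finish_reason to a short diagnosis."""
    hints = {
        "length": "hit the token limit; raise max_tokens",
        "stop": "a stop sequence matched",
    }
    return [
        hints.get(
            choice.get("finish_reason"),
            f"unrecognized: {choice.get('finish_reason')!r}",
        )
        for choice in response.get("choices", [])
    ]


# Hypothetical response body, trimmed to the fields used here.
sample = {"choices": [{"index": 0, "finish_reason": "length"}]}
print(explain_finish_reason(sample))
```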

Check the Server listening on http://... line in the server output to confirm host and port.

The default allows any origin. Custom CORS configuration is only available programmatically through MistralRsServerRouterBuilder.

The default body limit is 50 MB and is not configurable via the CLI. To change it, configure the limit programmatically through MistralRsServerRouterBuilder.
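Before sending a large request (for example one with inline base64 images), the payload size can be estimated on the client. This is a rough sketch: it assumes "50 MB" means 50 * 1024 * 1024 bytes, which the text above does not specify, and the helper names are hypothetical. An HTTP client may serialize JSON slightly differently, so treat the result as approximate.

```python
import json

BODY_LIMIT_BYTES = 50 * 1024 * 1024  # assumption: "50 MB" read as 50 MiB


def body_size(payload):
    """Size in bytes of the payload once serialized as UTF-8 JSON."""
    return len(json.dumps(payload).encode("utf-8"))


def fits_body_limit(payload, limit=BODY_LIMIT_BYTES):
    """True if the serialized payload is within the assumed limit."""
    return body_size(payload) <= limit
```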

The UI is on by default. Check that --no-ui was not passed at startup, and that no reverse proxy is rewriting /ui.

The session expired (30-minute idle TTL) or was evicted (128-session cap, LRU). Long-lived sessions need explicit export/import via /v1/sessions/{id}.
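To see why a session can disappear in either way, here is a toy model of a store with an idle TTL and an LRU cap. It is an illustration of the two eviction rules only, not the server's implementation. The class name `SessionCache` and its methods are hypothetical.

```python
import time
from collections import OrderedDict


class SessionCache:
    """Toy session store with an idle TTL and an LRU size cap."""

    def __init__(self, ttl_seconds=30 * 60, max_sessions=128, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.cap = max_sessions
        self.clock = clock
        self._items = OrderedDict()  # session id -> (last_used, data)

    def put(self, session_id, data):
        self._items[session_id] = (self.clock(), data)
        self._items.move_to_end(session_id)
        while len(self._items) > self.cap:
            self._items.popitem(last=False)  # evict the least recently used

    def get(self, session_id):
        entry = self._items.get(session_id)
        if entry is None:
            return None  # never existed, or already evicted
        last_used, data = entry
        if self.clock() - last_used > self.ttl:
            del self._items[session_id]  # idle for longer than the TTL
            return None
        self._items[session_id] = (self.clock(), data)  # refresh the idle timer
        self._items.move_to_end(session_id)
        return data
```

In this model a session that is read regularly survives the TTL but can still be pushed out once more than `max_sessions` others are created, which is why long-lived sessions are better exported explicitly.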

from mistralrs import Runner fails with ImportError

The wrong wheel was installed. Reinstall with the matching variant: mistralrs-cuda for NVIDIA, mistralrs-metal for Apple Silicon, mistralrs for CPU/MKL.

ModelBuilder::build() requires a tokio runtime

The SDK requires a running tokio runtime. Use #[tokio::main] or create a runtime with tokio::runtime::Runtime::new().