Run across multiple machines
The ring backend is a distributed transport selected by RING_CONFIG. It is separate from multi-node NCCL inference, which uses MISTRALRS_MN_* variables and NCCL across all ranks.
Use this page when you explicitly want the ring backend.
The ring feature must be compiled in:

    cargo install --path mistralrs-cli --features "cuda flash-attn ring"

If the binary is also built with nccl, set MISTRALRS_NO_NCCL=1 when launching so that Comm::from_device selects the ring backend.
Configuration
The ring backend reads its configuration from a JSON file pointed to by the RING_CONFIG environment variable. Each participant has its own RING_CONFIG with rank-specific values.
Config shape:

    {
      "master_ip": "10.0.0.1",
      "master_port": 9000,
      "port": 9001,
      "right_port": 9002,
      "right_ip": "10.0.0.2",
      "rank": 0,
      "world_size": 3
    }

Non-master ranks (rank != 0) must specify master_ip, the address at which the master rank (rank = 0) is reachable.
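Because every rank needs its own file, it can help to generate the configs rather than write them by hand. The following is a minimal sketch, not part of mistral.rs: build_ring_configs is a hypothetical helper, and it assumes that "port" is the port a rank listens on and that "right_ip"/"right_port" address the next rank in the ring, with the last rank wrapping around to rank 0. This page does not spell out those semantics, so check them against your setup before relying on the output.

```python
import json


def build_ring_configs(hosts, master_port=9000, base_port=9001):
    """Return one config dict per rank for a ring over `hosts`.

    Assumed semantics (not confirmed by this page): "port" is the port a rank
    listens on, and "right_ip"/"right_port" address the next rank in the ring.
    """
    world_size = len(hosts)
    configs = []
    for rank in range(world_size):
        right = (rank + 1) % world_size  # last rank wraps around to rank 0
        configs.append({
            "master_ip": hosts[0],  # rank 0 is treated as the master
            "master_port": master_port,
            "port": base_port + rank,  # assumption: one distinct port per rank
            "right_port": base_port + right,
            "right_ip": hosts[right],
            "rank": rank,
            "world_size": world_size,
        })
    return configs


if __name__ == "__main__":
    # Print one JSON document per rank; save each to a file on its machine
    # and point that machine's RING_CONFIG at it.
    for cfg in build_ring_configs(["10.0.0.1", "10.0.0.2", "10.0.0.3"]):
        print(json.dumps(cfg, indent=2))
```

With the three hosts above and the default ports, the rank 0 entry matches the config shape shown earlier.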
Environment
Ring backend selection is controlled by RING_CONFIG:
| Variable | Purpose |
|---|---|
| RING_CONFIG | Path to the per-rank ring JSON config. |
| MISTRALRS_NO_NCCL=1 | Required only when the same binary also has nccl and you want to force ring. |
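As an illustration of how the two variables combine, here is a small sketch of a launch wrapper. ring_launch_env is a hypothetical helper, not a mistral.rs API. It checks that the config file parses and contains the keys from the config shape on this page, then returns an environment with RING_CONFIG set and, only for binaries that also include nccl, MISTRALRS_NO_NCCL=1. Treating every key as required is a convention of this sketch and may be stricter than the backend itself.

```python
import json
import os

# Keys taken from the config shape shown on this page.
EXPECTED_KEYS = {
    "master_ip", "master_port", "port", "right_port",
    "right_ip", "rank", "world_size",
}


def ring_launch_env(config_path, binary_has_nccl, base_env=None):
    """Return a copy of the environment prepared for a ring-backend launch."""
    with open(config_path) as f:
        config = json.load(f)  # fails early on a missing or malformed file
    missing = EXPECTED_KEYS - config.keys()
    if missing:
        raise ValueError(f"{config_path} is missing keys: {sorted(missing)}")

    env = dict(os.environ if base_env is None else base_env)
    env["RING_CONFIG"] = os.path.abspath(config_path)
    if binary_has_nccl:
        # Only needed when the same binary was also built with nccl.
        env["MISTRALRS_NO_NCCL"] = "1"
    return env
```

The returned dictionary can be passed as env= to subprocess.run together with whatever launch command you normally use on that machine.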
Full env var reference: environment variables.
The ring backend is Linux-only. For CUDA tensor parallelism on one machine, prefer single-machine multi-GPU. For CUDA tensor parallelism across machines, prefer multi-node NCCL inference.