
Hardware support

The authoritative list of supported accelerators. Other pages link here rather than restating it.

| Platform | Acceleration | Prebuilt binary |
| --- | --- | --- |
| Linux x86_64 + NVIDIA GPU | CUDA (Ampere and newer) | yes, per compute capability |
| Linux aarch64 + NVIDIA GPU | CUDA (Grace: GH200/GB200/GB10) | yes, sm90/100/121 |
| Apple Silicon (macOS arm64) | Metal | yes |
| Linux x86_64 / aarch64, no GPU | CPU | yes |
| Windows x86_64 | CPU | yes |
| Intel Mac, unlisted GPU | source build | no |

The minimum supported NVIDIA GPU is Ampere (compute capability 8.0). Turing (sm75: RTX 20-series, GTX 16-series, Tesla T4) and older are not supported, because candle’s pre-Ampere CUDA path no longer builds against current toolkits. You can still attempt a source build for such a GPU with an older CUDA toolkit, but this is untested and unsupported.

| Compute capability | Architecture | Representative GPUs |
| --- | --- | --- |
| 8.0 | Ampere (datacenter) | A100, A30 |
| 8.6 | Ampere (consumer) | RTX 3090/3080/3070/3060, A40, A10 |
| 8.9 | Ada | RTX 4090/4080, L40, L4 |
| 9.0 | Hopper | H100, H200 |
| 10.0 | Blackwell (datacenter) | B200, GB200 |
| 12.0 | Blackwell (consumer) | RTX 5090/5080 |
| 12.1 | GB10 | DGX Spark |
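To check a GPU against the table above, you first need its compute capability. On recent NVIDIA drivers, `nvidia-smi --query-gpu=compute_cap --format=csv,noheader` prints it as a string such as `8.6`. The sketch below is a minimal illustration of classifying that string. The names `MIN_SUPPORTED`, `ARCHITECTURES`, `parse_compute_capability`, and `describe` are hypothetical helpers, not part of the project.

```python
# Hypothetical helpers for illustration; not part of the project's API.

# Minimum supported compute capability (Ampere), as stated above.
MIN_SUPPORTED = (8, 0)

# Architecture names copied from the compute capability table.
ARCHITECTURES = {
    (8, 0): "Ampere (datacenter)",
    (8, 6): "Ampere (consumer)",
    (8, 9): "Ada",
    (9, 0): "Hopper",
    (10, 0): "Blackwell (datacenter)",
    (12, 0): "Blackwell (consumer)",
    (12, 1): "GB10",
}


def parse_compute_capability(text: str) -> tuple[int, int]:
    """Turn a string such as '8.6' into the tuple (8, 6)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def describe(text: str) -> str:
    """Classify a compute capability string against the table."""
    cc = parse_compute_capability(text)
    if cc < MIN_SUPPORTED:
        # Turing (7.5) and older fall below the Ampere minimum.
        return "unsupported (pre-Ampere)"
    # Capabilities at or above 8.0 that the table does not list.
    return ARCHITECTURES.get(cc, "not listed")


print(describe("8.6"))
print(describe("7.5"))
```

Comparing `(major, minor)` tuples rather than floats keeps the ordering exact, so `12.1` sorts after `12.0` and `10.0` sorts after `9.0` without rounding concerns.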

Prebuilt CUDA artifacts are published per compute capability:

| Architecture | Compute capabilities |
| --- | --- |
| x86_64 | 80, 86, 89, 90, 100, 120 |
| aarch64 (NVIDIA Grace: GH200, GB200, GB10/DGX Spark) | 90, 100, 121 |

The install script downloads the binary matching your GPU and architecture; a GPU outside this set requires building from source. The same binaries back the Docker images, and the same compute capabilities are published as Python wheels (install with --find-links; see Python getting started). Each artifact is self-contained: it bundles the CUDA runtime libraries, so no CUDA toolkit is needed at runtime.
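The matching rule described above can be sketched as a lookup on CPU architecture and compute capability. This is an illustration of the published set only, not the install script's actual logic. The function `select_artifact` and the `prebuilt:...` label it returns are hypothetical and do not correspond to real file names.

```python
# Published CUDA compute capabilities per CPU architecture,
# copied from the table above.
PREBUILT = {
    "x86_64": {80, 86, 89, 90, 100, 120},
    "aarch64": {90, 100, 121},
}


def select_artifact(arch: str, compute_capability: str) -> str:
    """Return a label for the prebuilt artifact, or 'source-build'.

    `compute_capability` is a string such as '8.6'.
    The returned label is hypothetical, not a real artifact name.
    """
    major, minor = compute_capability.strip().split(".")
    sm = int(major) * 10 + int(minor)  # '8.6' -> 86, '12.1' -> 121
    if sm in PREBUILT.get(arch, set()):
        return f"prebuilt:{arch}-sm{sm}"
    # Anything outside the published set needs a source build.
    return "source-build"


print(select_artifact("x86_64", "8.6"))
print(select_artifact("aarch64", "8.6"))
```

Note that the same GPU generation can resolve differently per architecture: 8.6 has a prebuilt artifact on x86_64 but not on aarch64, where only the Grace-class capabilities are published.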

| Feature | Requirement |
| --- | --- |
| flash-attn (v2) | compute capability 8.0+ |
| flash-attn-v3 | Hopper (9.0) |
| FP8 matmul | compute capability 8.9+ |
| cuTile MoE backend | Ampere/Ada (8.x) or Blackwell+ (10.x/12.x), not Hopper; CUDA >= 13.1 |
| CUTLASS MoE backend | compute capability 8.0+ |
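The requirements above can be read as predicates on the compute capability and the CUDA version. The sketch below encodes them under two assumptions that the table does not spell out: "Hopper (9.0)" is taken to mean exactly 9.0, and "Blackwell+" is taken to mean a major version of 10 or higher. `available_features` is a hypothetical helper, and the strings it returns are the table's row labels, not cargo feature flags.

```python
def available_features(cc: tuple[int, int], cuda: tuple[int, int]) -> list[str]:
    """List the table rows whose requirement is met.

    cc   -- compute capability as (major, minor), e.g. (8, 6)
    cuda -- CUDA version as (major, minor), e.g. (13, 1)
    """
    major = cc[0]
    features = []
    if cc >= (8, 0):
        features.append("flash-attn (v2)")
    if cc == (9, 0):  # assumption: Hopper means exactly 9.0
        features.append("flash-attn-v3")
    if cc >= (8, 9):
        features.append("FP8 matmul")
    # 8.x or 10.x and newer (assumption for "Blackwell+"), never 9.x,
    # and only with CUDA 13.1 or later.
    if (major == 8 or major >= 10) and cuda >= (13, 1):
        features.append("cuTile MoE backend")
    if cc >= (8, 0):
        features.append("CUTLASS MoE backend")
    return features


print(available_features((9, 0), (13, 1)))
print(available_features((8, 6), (13, 1)))
```

The cuTile row is the only one that is not a simple lower bound: it skips Hopper and also depends on the CUDA version, which is why it takes both arguments into account.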

See cargo features for the feature flags and MoE expert backends for backend selection.