mistralrs_quant::distributed

Module layers


Structs

  • This layer has a weight that is parallelized along the output dimension, taking the “full” input dimension.
  • This layer has no parallelization; its weight is not sharded.
  • This layer has a weight that is parallelized along the input dimension, returning the “full” output dimension.
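
The two sharding schemes described above can be illustrated with a minimal, dependency-free sketch. It is not the crate's implementation: the function names `matvec`, `shard_output_dim`, and `shard_input_dim` are hypothetical, the weight is assumed to be a row-major `[out_dim, in_dim]` matrix, and the per-rank loop stands in for what would really be separate devices joined by a gather or reduce step.

```rust
/// Dense matrix-vector product; `w` is row-major with shape [out_dim, in_dim].
fn matvec(w: &[Vec<f32>], x: &[f32]) -> Vec<f32> {
    w.iter()
        .map(|row| row.iter().zip(x).map(|(a, b)| a * b).sum::<f32>())
        .collect()
}

/// Weight sharded along the output dimension (hypothetical helper).
/// Every rank sees the full input and produces a slice of the output;
/// the slices are concatenated, which stands in for an all-gather.
fn shard_output_dim(w: &[Vec<f32>], x: &[f32], world_size: usize) -> Vec<f32> {
    let out_dim = w.len();
    assert_eq!(out_dim % world_size, 0, "out_dim must divide evenly");
    let chunk = out_dim / world_size;
    let mut y = Vec::with_capacity(out_dim);
    for rank in 0..world_size {
        let shard = &w[rank * chunk..(rank + 1) * chunk];
        y.extend(matvec(shard, x));
    }
    y
}

/// Weight sharded along the input dimension (hypothetical helper).
/// Every rank sees only its slice of the input and produces a partial
/// result for the full output; the partials are summed, which stands in
/// for an all-reduce.
fn shard_input_dim(w: &[Vec<f32>], x: &[f32], world_size: usize) -> Vec<f32> {
    let in_dim = x.len();
    assert_eq!(in_dim % world_size, 0, "in_dim must divide evenly");
    let chunk = in_dim / world_size;
    let mut y = vec![0.0f32; w.len()];
    for rank in 0..world_size {
        let (lo, hi) = (rank * chunk, (rank + 1) * chunk);
        // Take this rank's columns of every weight row.
        let shard: Vec<Vec<f32>> = w.iter().map(|row| row[lo..hi].to_vec()).collect();
        let partial = matvec(&shard, &x[lo..hi]);
        for (acc, p) in y.iter_mut().zip(partial) {
            *acc += p;
        }
    }
    y
}
```

Both helpers should return the same vector as `matvec` on the unsharded weight. The difference is where the communication happens: output-dimension sharding needs the pieces gathered afterwards, while input-dimension sharding needs the partial sums reduced.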

Functions

  • Compute the appropriate KV shard. This handles KV head replication. Be sure to use compute_n_kv_groups in tandem.
  • Compute the number of KV groups, taking into account KV head replication.
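
As a rough illustration of why the two functions above go together, here is a sketch of KV head replication using plain integers. It does not reproduce the crate's signatures or return types: `kv_head_range` and `n_kv_groups` are hypothetical names, and the sketch assumes the head counts and world size divide evenly.

```rust
/// Hypothetical helper: returns (first_kv_head, num_kv_heads) held by `rank`.
/// Assumes `total_kv_heads` and `world_size` divide one another evenly.
fn kv_head_range(total_kv_heads: usize, rank: usize, world_size: usize) -> (usize, usize) {
    if world_size > total_kv_heads {
        // More ranks than KV heads: each head is replicated across
        // `replicas` consecutive ranks, and every rank holds one head.
        let replicas = world_size / total_kv_heads;
        (rank / replicas, 1)
    } else {
        // Enough heads to go around: plain contiguous sharding.
        let per_rank = total_kv_heads / world_size;
        (rank * per_rank, per_rank)
    }
}

/// Hypothetical helper: number of query heads that share each KV head
/// on a single rank, accounting for replication.
fn n_kv_groups(total_kv_heads: usize, num_attention_heads: usize, world_size: usize) -> usize {
    let replicas = if world_size > total_kv_heads {
        world_size / total_kv_heads
    } else {
        1
    };
    // When a KV head is replicated, the query heads that would share it
    // are split across the replicas, so the group size shrinks.
    (num_attention_heads / total_kv_heads) / replicas
}
```

For example, with 32 attention heads, 8 KV heads, and 16 ranks, each KV head is replicated on 2 ranks, so each rank holds 1 KV head serving 2 query heads. Using the unreplicated group size of 4 with a replicated shard would mismatch the local query and KV head counts, which is why the shard and the group count need to be computed consistently.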