Re-exports§
pub use safetensors::Shard;
pub use safetensors::ShardedSafeTensors;
pub use safetensors::ShardedVarBuilder;
pub use distributed::layers::compute_kv_shard;
pub use distributed::layers::compute_n_kv_groups;
pub use distributed::layers::ColumnParallelLayer;
pub use distributed::layers::ReplicatedLayer;
pub use distributed::layers::RowParallelLayer;
pub use distributed::socket::Client;
pub use distributed::socket::Server;
pub use distributed::BarrierLike;
pub use distributed::Comm;
pub use distributed::Id;
pub use distributed::SumAllReduce;
Modules§
Structs§
- Device/configurable intelligent matrix multiplication
Enums§
Constants§
- Offset for the quant type. UQFF always serializes the version first.
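As a rough illustration of why such an offset exists: if the version is always written first, the quant type tag sits at a fixed position right after it. The sketch below assumes a 4-byte little-endian version followed by a 1-byte quant type tag. That layout, and the names `VERSION_LEN`, `QUANT_TYPE_OFFSET`, `write_header`, and `read_header`, are hypothetical and are not taken from this crate.

```rust
// Hypothetical layout for illustration only: a 4-byte little-endian version
// followed by a 1-byte quant type tag. The real UQFF layout may differ.
const VERSION_LEN: usize = std::mem::size_of::<u32>();
// Because the version comes first, the quant type starts right after it.
const QUANT_TYPE_OFFSET: usize = VERSION_LEN;

/// Serialize the version first, then the quant type tag.
fn write_header(version: u32, quant_type: u8) -> Vec<u8> {
    let mut buf = Vec::with_capacity(VERSION_LEN + 1);
    buf.extend_from_slice(&version.to_le_bytes());
    buf.push(quant_type);
    buf
}

/// Read the version and the quant type tag back.
/// Returns `None` if the buffer is too short to hold both fields.
fn read_header(buf: &[u8]) -> Option<(u32, u8)> {
    let mut version_bytes = [0u8; 4];
    version_bytes.copy_from_slice(buf.get(..VERSION_LEN)?);
    let quant_type = *buf.get(QUANT_TYPE_OFFSET)?;
    Some((u32::from_le_bytes(version_bytes), quant_type))
}
```

Under these assumptions, a reader can check the version before interpreting anything else, then index directly at the offset to find out which quantization format the rest of the buffer holds.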
Traits§
- Quantized method for a quantized matmul.
Functions§
- Static LoRA in the style of Phi-4 multimodal. Applied only when the layer regex for the specific LoRA matches.