Customize
- LoRA and X-LoRA adapters: attach fine-tuned adapters to a base model.
- AnyMoE: mix multiple experts at inference time.
- MatFormer: elastic model sizing from a MatFormer-trained checkpoint.
- Chat templates: supply a template explicitly when auto-detection picks the wrong one or finds none.
- Sampling parameters: temperature, top-k, top-p, min-p, DRY, and their interactions.
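To make "their interactions" in the sampling item concrete, here is a minimal, generic sketch of how temperature, top-k, top-p, and min-p can combine. It is not this project's implementation: the helper names `filter_probs` and `sample` are hypothetical, DRY is left out, and the choice to evaluate every filter against the same temperature-scaled distribution and intersect the results is an assumption. Real samplers may apply the filters sequentially and in a different order.

```python
import math
import random


def filter_probs(logits, temperature=1.0, top_k=None, top_p=None, min_p=None):
    """Hypothetical helper: return a renormalized distribution after filtering.

    Assumption: each filter is evaluated on the same temperature-scaled
    distribution and the surviving token sets are intersected.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if top_k is not None and top_k < 1:
        raise ValueError("top_k must be at least 1")
    if min_p is not None and not 0.0 <= min_p <= 1.0:
        raise ValueError("min_p must be in [0, 1]")

    # Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Token indices sorted from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order)

    if top_k is not None:
        # Top-k: keep only the k most probable tokens.
        keep &= set(order[:top_k])

    if top_p is not None:
        # Top-p (nucleus): keep the smallest prefix whose mass reaches top_p.
        nucleus, cumulative = set(), 0.0
        for i in order:
            nucleus.add(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        keep &= nucleus

    if min_p is not None:
        # Min-p: drop tokens below a fraction of the top token's probability.
        threshold = min_p * probs[order[0]]
        keep &= {i for i in order if probs[i] >= threshold}

    kept_mass = sum(probs[i] for i in keep)
    return [probs[i] / kept_mass if i in keep else 0.0 for i in range(len(probs))]


def sample(logits, rng=random, **params):
    """Hypothetical helper: draw one token index from the filtered distribution."""
    probs = filter_probs(logits, **params)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

In this sketch the most probable token always survives every filter, so the distribution can be renormalized safely. Lowering the temperature concentrates mass on the top tokens, which in turn makes top-p and min-p cut more aggressively; that coupling is the kind of interaction the sampling item refers to.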