Customize
Control how a model loads and how it generates output:
- LoRA, X-LoRA, and AnyMoE - attach fine-tuned adapters or compose experts at inference time.
- Chat templates - override the prompt format when auto-detection picks the wrong one.
- Sampling parameters - temperature, top-k/top-p/min-p, penalties, and DRY.
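
To give a feel for what the sampling parameters do, here is a minimal, library-independent sketch in Python. It is not this project's implementation or API: the function name `filter_probs`, its signature, and the order in which the filters are applied are all assumptions made for illustration, and real engines may order or combine these steps differently. Penalties and DRY are left out.

```python
import math
import random


def filter_probs(logits, temperature=1.0, top_k=None, top_p=None, min_p=None):
    """Hypothetical helper: map raw logits to a filtered, renormalized
    {token_index: probability} dict. Assumes temperature > 0."""
    # Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank tokens from most to least likely.
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)

    # top-k: keep only the k most likely tokens.
    if top_k is not None:
        ranked = ranked[:top_k]

    # top-p (nucleus): keep the smallest prefix whose cumulative mass reaches top_p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for idx, p in ranked:
            kept.append((idx, p))
            cumulative += p
            if cumulative >= top_p:
                break
        ranked = kept

    # min-p: drop tokens whose probability is below min_p times the best token's.
    if min_p is not None:
        threshold = min_p * ranked[0][1]
        ranked = [(idx, p) for idx, p in ranked if p >= threshold]

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(p for _, p in ranked)
    return {idx: p / norm for idx, p in ranked}


# Usage: filter a toy 4-token distribution, then draw one token index.
candidates = filter_probs([2.0, 1.0, 0.0, -1.0], temperature=0.8, top_k=3, top_p=0.95)
next_token = random.choices(list(candidates), weights=list(candidates.values()))[0]
```

Each filter only removes candidates, so stacking them narrows the pool further; leaving a parameter as `None` in this sketch disables that filter.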