Chat templates
A chat template formats messages into the string the model receives. The wrong format produces output that is coherent but degraded. mistral.rs resolves the template from the model files automatically for almost every supported model; when that fails, override it with a template file:

    mistralrs run -m <model> --chat-template my-template.jinja

How the template is resolved
Highest priority first: a multimodal processor_config.json template wins over everything; then --jinja-explicit; then --chat-template; then the model repo's own template; then GGUF metadata as a last resort.
The base source (when no override flag is set) is picked in this order:
- chat_template.jinja in the model repo, if present.
- Otherwise, the chat_template field of the repo's tokenizer_config.json.
- If still empty, the repo's standalone chat_template.json (some newer models ship the template as this separate file).
Overrides apply on top of that base (the precedence summary above lists them highest-first):
- --chat-template <file> replaces the base source. The file must end in .json or .jinja.
- --jinja-explicit <file.jinja> overrides --chat-template. The value must be a path to a .jinja file.
- For multimodal models, a chat_template in processor_config.json takes precedence over everything above.
GGUF models are a special case: the template embedded in GGUF metadata is used only when none of the sources above produced one.
If nothing is found, the engine logs "No chat template source found" and accepts only raw completion prompts, not chat messages.
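The resolution order described above can be modeled as a first-match lookup over a priority list. The following is a minimal sketch of that documented order, not mistral.rs's actual implementation; the source labels and the resolve_template function are hypothetical names chosen for illustration.

```python
from typing import Mapping, Optional

# Hypothetical labels for each template source, highest priority first,
# following the order documented in this section.
PRECEDENCE = (
    "processor_config.json",  # multimodal models only
    "--jinja-explicit",
    "--chat-template",
    "chat_template.jinja",
    "tokenizer_config.json",  # its chat_template field
    "chat_template.json",
    "gguf_metadata",          # last resort for GGUF models
)


def resolve_template(available: Mapping[str, str]) -> Optional[str]:
    """Return the label of the winning source, or None if none has a template."""
    for name in PRECEDENCE:
        if available.get(name):  # skip missing or empty templates
            return name
    return None  # the "No chat template source found" case


winner = resolve_template({
    "tokenizer_config.json": "{{ ... }}",
    "--chat-template": "{{ ... }}",
})
print(winner)  # --chat-template
```

An override flag wins over anything the repo ships, and an empty mapping resolves to None, which corresponds to the completion-only fallback.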
Symptoms of a wrong template:
- Output quality below expectations.
- Special tokens (<|im_start|>, <bos>, etc.) leaking into output.
- Multi-turn degrading faster than single-turn.
- System prompts ignored or treated as user input.
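Leaked special tokens are the easiest of these symptoms to check mechanically. The sketch below is a rough heuristic, not part of mistral.rs: the find_leaked_tokens helper and its pattern list are assumptions, and the pattern should be adjusted to the special tokens of the model in use.

```python
import re
from typing import List

# Assumed patterns: ChatML-style <|...|> markers plus a few common literals.
# This is a heuristic; extend it with the model's own special tokens.
SPECIAL_TOKEN_RE = re.compile(r"<\|[^|<>\s]+\|>|<bos>|<eos>|</?s>")


def find_leaked_tokens(text: str) -> List[str]:
    """Return every special-token-like marker found in generated text."""
    return SPECIAL_TOKEN_RE.findall(text)


print(find_leaked_tokens("Sure, here it is.<|im_end|><|im_start|>user"))
# ['<|im_end|>', '<|im_start|>']
```

A non-empty result on ordinary chat output suggests the template and the model's expected format disagree.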
Overriding the template
Both override settings take a file path. A .jinja file is the raw template; special tokens (bos_token, eos_token, unk_token) are read from a tokenizer_config.json next to the template file, falling back to the model's tokenizer.json. A .json file carries the template in a chat_template field and may set the special tokens itself:

    {
      "chat_template": "{% for message in messages %}...{% endfor %}",
      "bos_token": "<s>",
      "eos_token": "</s>"
    }

On the command line:

    mistralrs run -m <model> --chat-template my-template.jinja

--jinja-explicit <file.jinja> overrides --chat-template when both are set and only accepts .jinja files. Both flags work on run and serve.
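Hand-writing the .json form means escaping the template's quotes and newlines yourself, so generating the file may be less error-prone. This is a minimal sketch using only the Python standard library; write_template_json is a hypothetical helper and the template string is a placeholder, not a template for any real model.

```python
import json
import tempfile
from pathlib import Path


def write_template_json(path: Path, template: str,
                        bos_token: str = "<s>", eos_token: str = "</s>") -> None:
    """Write a .json override file with the fields described above."""
    if path.suffix != ".json":
        raise ValueError("expected a path ending in .json")
    payload = {
        "chat_template": template,
        "bos_token": bos_token,
        "eos_token": eos_token,
    }
    # json.dumps escapes quotes and newlines inside the template string.
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")


# Placeholder template for illustration only.
template = "{% for message in messages %}{{ message['content'] }}\n{% endfor %}"
out = Path(tempfile.mkdtemp()) / "my-template.json"
write_template_json(out, template)
print(out.read_text(encoding="utf-8"))
```

The resulting file can then be passed to --chat-template like any other .json template.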

From Python:

    from mistralrs import Runner, Which

    runner = Runner(
        which=Which.Plain(model_id="<model>"),
        chat_template="my-template.jinja",  # or a .json file
        # jinja_explicit="my-template.jinja",
    )

From Rust:

    let model = ModelBuilder::new("<model>")
        .with_chat_template("my-template.jinja")
        // .with_jinja_explicit("my-template.jinja".to_string())
        .build()
        .await?;

Bundled templates
The source repository's chat_templates/ directory contains ready-made templates: .json templates for common formats (chatml, llama2, llama3, mistral, phi3, vicuna, ...) and .jinja tool-calling templates (Mistral Nemo/Small, Hermes 2 Pro/3, DeepSeek, SmolLM3, Gemma 3n). They are plain files, not built into the binary; download or clone them and pass a path:

    mistralrs run -m <model> --chat-template chat_templates/llama3.json

Writing a template from scratch
Templates are Jinja with Hugging Face conventions, rendered by minijinja with Python-style string methods (.strip(), etc.) enabled. General pattern:

    {% if messages[0]['role'] == 'system' %}
    {{ bos_token }}<|system|>{{ messages[0]['content'] }}<|end|>
    {% endif %}
    {% for msg in messages[(1 if messages[0]['role'] == 'system' else 0):] %}
    {% if msg['role'] == 'user' %}
    <|user|>{{ msg['content'] }}<|end|>
    {% elif msg['role'] == 'assistant' %}
    <|assistant|>{{ msg['content'] }}<|end|>
    {% endif %}
    {% endfor %}
    {% if add_generation_prompt %}
    <|assistant|>
    {% endif %}

Variables available at render time:
- messages: the chat message list.
- add_generation_prompt: true when building a prompt for generation.
- bos_token, eos_token, unk_token: model special tokens.
- date_string: the current UTC date as a preformatted string (DD, Month, YYYY, e.g. 13, June, 2026).
- enable_thinking, reasoning_effort: reasoning controls for models that use them.
- tools and builtin_tools: present only when the request carries tool schemas.
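To see what the date_string format looks like in practice, the documented shape (DD, Month, YYYY) can be reproduced with a strftime pattern. This is a sketch of the format only: the "%d, %B, %Y" pattern is an assumption inferred from the example above, not taken from the mistral.rs source, and the month name depends on the process locale (English under the default C locale).

```python
from datetime import datetime, timezone


def date_string(now: datetime) -> str:
    """Format a datetime as 'DD, Month, YYYY' (assumed pattern)."""
    return now.strftime("%d, %B, %Y")


# Fixed date matching the documented example.
print(date_string(datetime(2026, 6, 13, tzinfo=timezone.utc)))  # 13, June, 2026

# Current UTC date, as a template would receive it at render time.
print(date_string(datetime.now(timezone.utc)))
```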
The functions raise_exception(msg) and strftime_now(fmt) and the tojson filter are available, matching Hugging Face's template environment. For model-specific tokens and role markers, the model's Hugging Face page is authoritative.
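When debugging a new template, it can help to compare the rendered prompt against a hand-written reference. The sketch below mirrors the general pattern shown earlier in plain Python, with no Jinja involved; render_chat is a hypothetical helper, the <|system|>, <|user|>, <|assistant|> and <|end|> markers come from that example rather than from any specific model, and whitespace between template tags is ignored.

```python
from typing import Dict, List


def render_chat(messages: List[Dict[str, str]], bos_token: str = "<s>",
                add_generation_prompt: bool = True) -> str:
    """Plain-Python mirror of the general pattern, ignoring template whitespace."""
    parts: List[str] = []
    has_system = bool(messages) and messages[0]["role"] == "system"
    if has_system:
        # As in the example template, bos_token is emitted with the system turn.
        parts.append(f"{bos_token}<|system|>{messages[0]['content']}<|end|>")
    for msg in messages[1 if has_system else 0:]:
        if msg["role"] == "user":
            parts.append(f"<|user|>{msg['content']}<|end|>")
        elif msg["role"] == "assistant":
            parts.append(f"<|assistant|>{msg['content']}<|end|>")
    if add_generation_prompt:
        parts.append("<|assistant|>")  # cue the model to answer
    return "".join(parts)


prompt = render_chat([
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Hi"},
])
print(prompt)
# <s><|system|>Be brief.<|end|><|user|>Hi<|end|><|assistant|>
```

If the string the engine actually sends differs from a reference like this, the template or its special tokens are the first place to look.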
Multimodal models need templates that handle non-text content parts, usually via placeholder tokens like <|image|> or <|audio|>; most multimodal repos ship theirs in processor_config.json or chat_template.json, which mistral.rs picks up automatically.