Generate images with diffusion models

mistral.rs serves diffusion models through POST /v1/images/generations. The main supported model is FLUX; see the supported models reference.

mistralrs serve -m black-forest-labs/FLUX.1-schnell

FLUX.1-schnell is permissively licensed. FLUX.1-dev requires accepting its license on Hugging Face, following the same flow as the Gemma setup.

For low-memory hosts, use the offloaded architecture. It keeps far less on the GPU at the cost of much slower generation:

| Loader | GPU memory target | Notes |
| --- | --- | --- |
| Flux | about 33 GB | Fully loaded path. Fastest. |
| FluxOffloaded | about 4 GB | CPU offload path. Useful when the full model does not fit. |

FLUX has roughly 12B parameters, and the T5 XXL text encoder adds a large memory footprint on top of that. Diffusion models do not support ISQ.
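The memory targets in the table above can be turned into a simple selection rule. The sketch below is illustrative only: `pick_loader` is a hypothetical helper, not part of mistral.rs, and it treats the approximate targets from the table as hard thresholds, which is an assumption.

```python
from typing import Optional

# Approximate GPU memory targets from the table above, in GB.
# Treating them as hard cut-offs is an assumption made for this sketch.
FLUX_GB = 33.0
FLUX_OFFLOADED_GB = 4.0


def pick_loader(free_gpu_gb: float) -> Optional[str]:
    """Return the loader name that fits, or None if neither does."""
    if free_gpu_gb >= FLUX_GB:
        return "Flux"  # fully loaded path, fastest
    if free_gpu_gb >= FLUX_OFFLOADED_GB:
        return "FluxOffloaded"  # CPU offload path, slower
    return None
```

Under these assumed thresholds, a 24 GB card gets `FluxOffloaded`, because the fully loaded path would not fit.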

Generating an image:

curl http://localhost:1234/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "prompt": "A photograph of a golden retriever wearing a scarf in autumn leaves.",
    "n": 1,
    "height": 1024,
    "width": 1024
  }'

The response is JSON with a data array. Each entry has either url (the server-side filename where the PNG was written) or b64_json (a data:image/png;base64,... data URL string), depending on the request's response_format field. The default is "Url".

| Field | Default | Notes |
| --- | --- | --- |
| prompt | required | Text prompt. |
| n | 1 | Number of images. |
| height | 720 | Output height in pixels. |
| width | 1280 | Output width in pixels. |
| response_format | "Url" | "Url" (response carries a server-side filename in url) or "B64Json" (response carries a data:image/png;base64,... string in b64_json). |

size (the OpenAI string form) is not supported. Use height and width.
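Clients written against the OpenAI API usually pass size as a "WIDTHxHEIGHT" string, so a small adapter can help when porting them. The following is a minimal sketch, assuming only the request fields listed above. `size_to_dims`, `build_payload`, and `generate` are hypothetical helper names, and the base URL and "default" model name are copied from the curl example.

```python
import json
import urllib.request


def size_to_dims(size: str) -> tuple[int, int]:
    """Split an OpenAI-style "WIDTHxHEIGHT" string into (width, height)."""
    width, height = size.lower().split("x")
    return int(width), int(height)


def build_payload(prompt: str, size: str = "1280x720", n: int = 1,
                  response_format: str = "Url") -> dict:
    """Build a request body that uses height/width instead of size."""
    width, height = size_to_dims(size)
    return {
        "model": "default",
        "prompt": prompt,
        "n": n,
        "height": height,
        "width": width,
        "response_format": response_format,
    }


def generate(payload: dict,
             base_url: str = "http://localhost:1234") -> dict:
    """POST the payload to the images endpoint and return the parsed JSON."""
    req = urllib.request.Request(
        f"{base_url}/v1/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With a server running, `generate(build_payload("a red bicycle", size="1024x1024"))` would send the same kind of request as the curl example.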

FLUX is memory-hungry at native precision. Diffusion models do not support runtime ISQ; load them at native precision instead of passing --quant or --isq.

from mistralrs import (
    DiffusionArchitecture,
    ImageGenerationResponseFormat,
    Runner,
    Which,
)

# Load FLUX.1-schnell with the CPU-offloaded architecture for low-memory hosts.
runner = Runner(
    which=Which.DiffusionPlain(
        model_id="black-forest-labs/FLUX.1-schnell",
        arch=DiffusionArchitecture.FluxOffloaded,
    )
)

# Generate an image and ask for a server-side filename in the response.
response = runner.generate_image(
    "A vibrant sunset in the mountains, high quality.",
    ImageGenerationResponseFormat.Url,
)
print(response.data[0].url)

With Url (the default), the server writes the PNG to disk and returns its filename in url:

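import shutil

# Assumption: `response` is the parsed JSON body (a dict) from the HTTP endpoint.
# The filename is on the server's filesystem, so run this on the same host.
saved = response["data"][0]["url"]
shutil.copy(saved, "out.png")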

With B64Json, b64_json is a data:image/png;base64,... string. Strip the prefix before decoding:

import base64, re

data_url = response["data"][0]["b64_json"]
payload = re.sub(r"^data:image/\w+;base64,", "", data_url)
with open("out.png", "wb") as f:
    f.write(base64.b64decode(payload))
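The two cases above can be folded into one helper. This is a sketch, assuming each entry of the data array is a dict with either a url or a b64_json key, as described earlier. `save_entry` is a hypothetical name, and the url branch assumes the code runs on the same host as the server, since the filename is server-side.

```python
import base64
import re
import shutil


def save_entry(entry: dict, dest: str) -> None:
    """Write one entry of the response's data array to dest as a PNG."""
    data_url = entry.get("b64_json")
    if data_url:
        # Strip the "data:image/png;base64," prefix before decoding.
        payload = re.sub(r"^data:image/\w+;base64,", "", data_url)
        with open(dest, "wb") as f:
            f.write(base64.b64decode(payload))
    elif entry.get("url"):
        # Server-side filename: only reachable on the server's own host.
        shutil.copy(entry["url"], dest)
    else:
        raise ValueError("entry has neither url nor b64_json")
```

Calling `save_entry(response["data"][0], "out.png")` then works for either response_format without the caller branching on it.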

The same endpoint is callable from the Rust SDK via Model::generate_image.