
OpenAI-compatible file inputs

mistral.rs supports OpenAI-compatible user file inputs on the server API and local SDKs.

Use this when a request needs to analyze or transform user-provided files, or when a Responses shell/Skill workflow should see request files in the session working directory.

For field-level compatibility notes, see OpenAI compatibility. For wire schema and lifetime rules, see HTTP API file semantics.

file_data is decoded before it reaches the model. Base64 is never placed in prompt context.
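
To make the decoding step concrete, here is a minimal Python sketch of splitting a base64 data URL into a media type and raw bytes. The function name `decode_file_data` is hypothetical and the logic is an assumption for illustration, not mistral.rs's actual implementation, which may accept other forms of `file_data`.

```python
import base64


def decode_file_data(file_data: str) -> tuple[str, bytes]:
    """Split a base64 data URL into (media_type, raw_bytes).

    Hypothetical helper: illustrates the idea of decoding before prompting.
    """
    if not file_data.startswith("data:"):
        raise ValueError("expected a data: URL")
    header, sep, payload = file_data[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URL has no comma separator")
    # The header looks like "text/markdown;base64", possibly with extra parameters.
    parts = header.split(";")
    if "base64" not in parts[1:]:
        raise ValueError("only base64 data URLs are handled in this sketch")
    media_type = parts[0] or "application/octet-stream"
    return media_type, base64.b64decode(payload, validate=True)


if __name__ == "__main__":
    encoded = base64.b64encode(b"# Notes").decode("ascii")
    media_type, raw = decode_file_data(f"data:text/markdown;base64,{encoded}")
    # Only the decoded text would ever be shown to a model, never `encoded`.
    print(media_type, raw.decode("utf-8"))
```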

Uploaded, inline, URL-fetched, and SDK-provided files are visible through the server file endpoints.

Text-like UTF-8 files get:

  • metadata in prompt context,
  • a decoded preview of up to 4096 chars per file,
  • a total preview budget of 32768 chars per request,
  • access to additional text during agentic tool runs when the preview is not enough.
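
The two preview limits interact: once the per-request budget is spent, later files would receive no preview text. The sketch below shows one way the limits could combine. The function name and the first-come, first-served allocation order are assumptions made for illustration; the source only states the two limits, not how mistral.rs distributes the budget.

```python
PER_FILE_CHARS = 4096      # preview cap per file, from the limits above
PER_REQUEST_CHARS = 32768  # total preview cap per request, from the limits above


def build_previews(texts, per_file=PER_FILE_CHARS, per_request=PER_REQUEST_CHARS):
    """Truncate each text to the per-file cap while honoring the request cap.

    Assumption: files are served in order until the request budget runs out.
    """
    remaining = per_request
    previews = []
    for text in texts:
        take = min(len(text), per_file, remaining)
        previews.append(text[:take])
        remaining -= take
    return previews


if __name__ == "__main__":
    # Ten 5000-character files: eight previews fill the whole request budget.
    previews = build_previews(["x" * 5000] * 10)
    print([len(p) for p in previews])
```

Under this assumed ordering, a request with many large text files is where the agentic tool path becomes necessary, since the later files contribute metadata only.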

Binary or non-UTF-8 files get metadata only in prompt context. They are still stored, downloadable from GET /v1/files/{id}/content, and mounted into shell/code workdirs when those tools are active.
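
The text-versus-binary split can be approximated with a strict UTF-8 decode. This is a minimal sketch with a hypothetical function name; mistral.rs may apply additional heuristics (for example on the file extension or media type) that are not described here.

```python
def is_text_like(data: bytes) -> bool:
    """Return True when the payload is valid UTF-8 (assumed classification rule)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(is_text_like("name,total\nwidget,3\n".encode("utf-8")))  # CSV text
    print(is_text_like(b"\x89PNG\r\n\x1a\n"))                      # PNG signature
```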

Extraction support is intentionally simple:

| File kind | Behavior |
| --- | --- |
| Text-like UTF-8 files | Text, CSV, JSON, XML, Markdown, YAML, TOML, HTML, source files, and other valid UTF-8 payloads are readable as text. |
| Binary or structured documents | PDFs, images, archives, spreadsheets, and other binary formats are stored and mounted, but mistral.rs does not yet perform OCR or extract PDF text or spreadsheet summaries. |

Responses file_url fetches only http and https URLs, with a timeout, redirect cap, decoded-size cap, and basic local/private host rejection. Chat Completions does not support file URLs; upload the file first or use inline file_data.
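
A client can pre-check a URL before sending it as `file_url`. The sketch below mirrors the scheme and host rules described above using only the Python standard library. The function name is hypothetical, and the exact set of hosts mistral.rs rejects is an assumption; the server remains the authority, and this check does not cover the timeout, redirect, or size limits.

```python
import ipaddress
from urllib.parse import urlsplit


def is_allowed_file_url(url: str) -> bool:
    """Rough client-side approximation of the file_url restrictions (assumed rules)."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. an unterminated IPv6 literal
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname
    if not host:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: a stricter check would also inspect the resolved addresses.
        return True
    return not (ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_unspecified)


if __name__ == "__main__":
    for candidate in (
        "https://example.com/data.csv",
        "file:///etc/passwd",
        "http://127.0.0.1:8000/secret.txt",
    ):
        print(candidate, is_allowed_file_url(candidate))
```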

Start a server:

mistralrs serve --agent -m <model>

--agent is useful when you want file pagination, Python code execution, shell execution, or OpenAI-compatible Skills. Plain requests still receive text previews even without tool execution.


# Upload a file once, then reference it by id in a Responses request.
from openai import OpenAI

# api_key is a placeholder value in this example.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-used")

with open("data.csv", "rb") as file:
    uploaded = client.files.create(file=file, purpose="user_data")

response = client.responses.create(
    model="default",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_file", "file_id": uploaded.id},
                {"type": "input_text", "text": "Summarize this CSV."},
            ],
        }
    ],
)
print(response.output_text)


Use inline Responses input_file when you do not want a separate upload:

import base64

from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-used")

# Encode the file as base64 so it can travel inline as a data URL.
with open("notes.md", "rb") as file:
    data = base64.b64encode(file.read()).decode("utf-8")

response = client.responses.create(
    model="default",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "filename": "notes.md",
                    "file_data": f"data:text/markdown;base64,{data}",
                },
                {"type": "input_text", "text": "Extract the action items."},
            ],
        }
    ],
)
print(response.output_text)
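
When several files are sent inline, the encoding step can be factored into a small helper. The following is a sketch only: `make_input_file_part` is a hypothetical name, and guessing the media type from the filename with `mimetypes` is an assumption about what is convenient on the client, not something the server is known to require.

```python
import base64
import mimetypes


def make_input_file_part(filename: str, raw: bytes) -> dict:
    """Build a Responses `input_file` content part with inline `file_data`."""
    # Fall back to a generic binary type when the extension is unknown.
    media_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    encoded = base64.b64encode(raw).decode("ascii")
    return {
        "type": "input_file",
        "filename": filename,
        "file_data": f"data:{media_type};base64,{encoded}",
    }


if __name__ == "__main__":
    part = make_input_file_part("report.json", b'{"ok": true}')
    print(part["filename"], part["file_data"][:40])
```

The returned dictionary can be placed directly in the `content` list of a Responses request, alongside an `input_text` part.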

Chat Completions uses type: "file" content parts:

import base64

from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-used")

# Encode the file as base64 so it can travel inline as a data URL.
with open("report.json", "rb") as file:
    data = base64.b64encode(file.read()).decode("utf-8")

completion = client.chat.completions.create(
    model="default",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    # Chat Completions nests the file fields under "file".
                    "type": "file",
                    "file": {
                        "filename": "report.json",
                        "file_data": f"data:application/json;base64,{data}",
                    },
                },
                {"type": "text", "text": "What are the main anomalies?"},
            ],
        }
    ],
)
print(completion.choices[0].message.content)