Sessions
Agentic requests are stateful. Pass a session_id to keep state across requests; the response echoes it back. See Lifetime for eviction and TTL limits.
```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "messages": [{"role": "user", "content": "What is 2 to the 10th?"}],
    "session_id": "user-42-chat-abc"
  }'
```

The response body includes a top-level session_id field.
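For Python clients that talk to the HTTP endpoint directly, the same round trip might look like the sketch below. It uses only the standard library. The helper names build_chat_payload, read_session_id, and send_chat are illustrative and not part of any SDK, and BASE_URL assumes the local server from the curl example.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234"  # assumption: same local server as the curl example


def build_chat_payload(content, session_id=None, model="default"):
    """Build a chat-completions body; session_id is left out when None."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }
    if session_id is not None:
        payload["session_id"] = session_id
    return payload


def read_session_id(response_body):
    """Return the top-level session_id echoed in a response body (str or bytes)."""
    return json.loads(response_body)["session_id"]


def send_chat(content, session_id=None):
    """POST the request and return (parsed_response, session_id).

    Requires a running server at BASE_URL.
    """
    request = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(build_chat_payload(content, session_id)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = response.read()
    return json.loads(body), read_session_id(body)
```

Storing the returned id and passing it back on the next call keeps both requests in the same session.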
How a request finds its session
- Explicit session_id on the request: direct lookup. An existing session continues; an unknown id starts a new session under that id.
- No session_id: the server falls back to content matching. It matches the incoming messages against the user-visible message prefix of a stored session, subject to two conditions:
  - tool-role entries in stored history are skipped;
  - at least two incoming messages are required.
On a match, that session and its id are reused; otherwise a fresh id is generated. Either way the id is returned in the response.
Content matching exists for clients that cannot pass session_id. When two clients send identical opening messages, it can route them to the same session; pass an explicit session_id in correctness-sensitive deployments.
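The lookup rules can be modelled with a small Python sketch. This is a toy model for illustration, not the server's implementation: the store layout, the resolve_session and visible helpers, and the exact prefix rule (the messages before the newest one must equal a prefix of the stored user-visible history) are all assumptions.

```python
import uuid


def visible(history):
    """Drop tool-role entries, which content matching skips."""
    return [m for m in history if m["role"] != "tool"]


def resolve_session(store, incoming, session_id=None):
    """Return (session_id, created) for a request.

    store maps session id -> stored message history (a list of dicts).
    """
    if session_id is not None:
        # Explicit id: direct lookup; an unknown id starts a session under it.
        created = session_id not in store
        store.setdefault(session_id, [])
        return session_id, created

    if len(incoming) >= 2:
        # Assumed prefix rule: everything before the newest message must
        # equal a prefix of the stored user-visible history.
        earlier = incoming[:-1]
        for sid, history in store.items():
            if visible(history)[: len(earlier)] == earlier:
                return sid, False

    # No match (or too few messages): generate a fresh id.
    new_id = uuid.uuid4().hex
    store[new_id] = []
    return new_id, True
```

In this model, two clients that send the same opening exchange without a session_id resolve to the same stored session, which is the collision that an explicit id avoids.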
A session holds:
- the full message history, including tool-call records and synthesized tool messages;
- multimodal payloads from earlier turns;
- a handle to the Python code-execution subprocess, if any.
On match, stored history is spliced into the request so that editing a previous turn works while tool-call history is retained; the algorithm is documented in session memory internals.
Export, import, delete
```bash
# Export (404 if the session does not exist)
curl http://localhost:1234/v1/sessions/user-42-chat-abc

# Import or replace: body is a SerializedSession from a previous GET
curl -X PUT http://localhost:1234/v1/sessions/user-42-chat-abc \
  -H "Content-Type: application/json" \
  -d @saved-session.json

# Delete (idempotent: 200 whether or not the session existed)
curl -X DELETE http://localhost:1234/v1/sessions/user-42-chat-abc
```

PUT replaces any existing session with the same id and returns 400 for an invalid payload.
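A Python application that stays on HTTP could wrap the three endpoints as in the sketch below, using only the standard library. The function names and BASE_URL are illustrative assumptions, and the SerializedSession is treated as opaque JSON that is passed back unchanged.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:1234"  # assumption: local server from the curl examples


def session_request(session_id, method, body=None):
    """Build a urllib Request for GET, PUT, or DELETE on /v1/sessions/{id}."""
    data = None
    headers = {}
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    quoted = urllib.parse.quote(session_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v1/sessions/{quoted}",
        data=data, headers=headers, method=method,
    )


def export_session(session_id):
    """Return the exported session as parsed JSON, or None on 404."""
    try:
        with urllib.request.urlopen(session_request(session_id, "GET")) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def import_session(session_id, serialized):
    """PUT a previously exported session; a 400 response raises ValueError."""
    try:
        with urllib.request.urlopen(session_request(session_id, "PUT", serialized)):
            pass
    except urllib.error.HTTPError as err:
        if err.code == 400:
            raise ValueError("server rejected the session payload") from err
        raise


def delete_session(session_id):
    """DELETE is idempotent, so no existence check is needed first."""
    with urllib.request.urlopen(session_request(session_id, "DELETE")):
        pass
```

Mapping 404 to None and 400 to ValueError lets the caller tell the documented outcomes apart from transport failures.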
Runner exposes the same operations as the HTTP endpoints; each takes an optional model_id keyword for multi-model setups:
```python
from mistralrs import ChatCompletionRequest, Runner, Which

runner = Runner(which=Which.Plain(model_id="Qwen/Qwen3-4B"))

ids = runner.list_session_ids()
exported = runner.export_session("user-42-chat-abc")  # JSON string or None
if exported is not None:
    runner.import_session("user-42-chat-new-id", exported)
runner.delete_session("user-42-chat-abc")
```

In-process requests reuse agentic state by setting session_id on the request:

```python
response = runner.send_chat_completion_request(
    ChatCompletionRequest(
        model="default",
        messages=[{"role": "user", "content": "Continue the analysis."}],
        session_id="user-42-chat-abc",
    )
)
```

Use HTTP when a Python application also needs the live agentic_tool_call_progress timeline (streamed tool-call progress events; see tool calling).
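Because sessions do not survive a server restart, the Runner methods above could be combined into a snapshot and restore routine. The sketch below is one possible shape, assuming that list_session_ids returns an iterable of strings and that session ids are safe to use as file names. backup_sessions and restore_sessions are hypothetical helpers, not part of mistralrs.

```python
from pathlib import Path


def backup_sessions(runner, directory):
    """Write each exported session to <directory>/<session_id>.json."""
    out = Path(directory)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for session_id in runner.list_session_ids():
        exported = runner.export_session(session_id)  # JSON string or None
        if exported is None:
            continue  # session disappeared between listing and export
        (out / f"{session_id}.json").write_text(exported, encoding="utf-8")
        saved.append(session_id)
    return saved


def restore_sessions(runner, directory):
    """Import every <session_id>.json file found in directory."""
    restored = []
    for path in sorted(Path(directory).glob("*.json")):
        runner.import_session(path.stem, path.read_text(encoding="utf-8"))
        restored.append(path.stem)
    return restored
```

Both functions only call methods named on this page, so any object with the same three methods can stand in for the runner in tests.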
Model exposes export_session, import_session, delete_session, and list_session_ids, plus fork_session(model_id, src_session_id, dest_session_id, num_turns), which clones the first num_turns complete turns of a session under a new id. Like the other four methods, it takes a leading model_id: Option<&str> for multi-model setups. Request-level session_id is set via RequestBuilder::with_session_id:
```rust
let request = RequestBuilder::from(messages).with_session_id("user-42-chat-abc");
```

Lifetime
- Idle expiry: 30 minutes of inactivity.
- Capacity: 128-session cap with LRU eviction.
- Server restart: full loss unless exported and re-imported.
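How idle expiry and the LRU cap interact can be illustrated with a toy Python model. It is a sketch under stated assumptions, not the server's implementation: expiry is checked lazily on each request, a session expires once its idle time reaches the TTL, and the SessionTable class is hypothetical. Time is passed in explicitly so the behaviour is deterministic.

```python
from collections import OrderedDict

IDLE_TTL_SECONDS = 30 * 60  # idle expiry from the list above
MAX_SESSIONS = 128          # capacity from the list above


class SessionTable:
    """Toy model: session id -> (last_used, state), least recently used first."""

    def __init__(self, ttl=IDLE_TTL_SECONDS, capacity=MAX_SESSIONS):
        self.ttl = ttl
        self.capacity = capacity
        self.entries = OrderedDict()

    def _expire(self, now):
        # Entries are ordered by last use, so stop at the first live one.
        while self.entries:
            sid, (last_used, _) = next(iter(self.entries.items()))
            if now - last_used < self.ttl:
                break
            del self.entries[sid]

    def touch(self, sid, now, state=None):
        """Record a request for sid at time now and return its state."""
        self._expire(now)
        if sid in self.entries:
            # Existing session: keep its stored state, refresh its position.
            _, state = self.entries.pop(sid)
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        self.entries[sid] = (now, state)
        return state
```

In this model an active session is never evicted by the cap while older idle sessions remain, since every request moves its session to the most recently used end.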
Code execution subprocess
Sessions with code execution hold a Python subprocess. The subprocess is not part of the exportable state: after import on another server, a fresh subprocess starts on the next code execution call. Variables and imports from the original interpreter are gone, but message and tool history carry over.