Agentic sessions from Python
Sessions on the HTTP server are keyed by session id and persist message history, tool-call records, images, and (when applicable) the Python code-execution subprocess. See the persist-sessions guide for the underlying behavior.
In-process with Runner
Runner exposes the same session operations as the HTTP endpoints:
from mistralrs import Runner, Which
runner = Runner(which=Which.Plain(model_id="Qwen/Qwen3-4B"))
ids = runner.list_session_ids()
exported = runner.export_session("user-42-chat-abc")  # JSON string or None
if exported is not None:
    runner.import_session("user-42-chat-new-id", exported)
runner.delete_session("user-42-chat-abc")
Each method takes an optional model_id keyword argument for multi-model setups.
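The export/import pair above can be wrapped in a small helper that forks a session under a new id. This is a minimal sketch, not part of mistralrs: `clone_session` and `FakeRunner` are hypothetical names, and the helper only assumes an object with `export_session` and `import_session` methods behaving as in the snippet above (export returns a JSON string or None).

```python
def clone_session(runner, src_id, dst_id):
    """Copy session src_id to dst_id.

    Returns True on success and False when src_id does not exist
    (export_session returned None).
    """
    exported = runner.export_session(src_id)
    if exported is None:
        return False
    runner.import_session(dst_id, exported)
    return True


class FakeRunner:
    """Hypothetical in-memory stand-in for Runner, used only to try the helper."""

    def __init__(self):
        self.sessions = {}

    def export_session(self, session_id):
        # Mirrors the documented contract: JSON string or None.
        return self.sessions.get(session_id)

    def import_session(self, session_id, data):
        self.sessions[session_id] = data


fake = FakeRunner()
fake.sessions["user-42-chat-abc"] = '{"messages": []}'
print(clone_session(fake, "user-42-chat-abc", "user-42-chat-fork"))
print(clone_session(fake, "no-such-session", "user-42-chat-other"))
```

With a real Runner in place of `FakeRunner`, the same helper should apply unchanged, provided the method behavior matches the snippet above.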
ChatCompletionRequest can carry a session_id, so in-process Python requests can reuse agentic state:
from mistralrs import ChatCompletionRequest
response = runner.send_chat_completion_request(
    ChatCompletionRequest(
        model="default",
        messages=[{"role": "user", "content": "Continue the analysis."}],
        session_id="user-42-chat-abc",
    )
)
Use HTTP when a Python application also needs the live agentic_tool_call_progress timeline.
Example: HTTP from Python
import requests
# Create a session implicitly
r = requests.post("http://localhost:1234/v1/chat/completions", json={
    "model": "default",
    "messages": [{"role": "user", "content": "Research recent Rust releases."}],
    "session_id": "user-42-chat-abc",
})
print(r.json()["choices"][0]["message"]["content"])
# Continue the same session
r = requests.post("http://localhost:1234/v1/chat/completions", json={
    "model": "default",
    "messages": [{"role": "user", "content": "Summarize what you found."}],
    "session_id": "user-42-chat-abc",
})
Export, import, delete
# Export
exported = requests.get(
    "http://localhost:1234/v1/sessions/user-42-chat-abc"
).json()
# Import elsewhere
requests.put(
    "http://localhost:1234/v1/sessions/user-42-chat-abc",
    json=exported,
)
# Delete
requests.delete("http://localhost:1234/v1/sessions/user-42-chat-abc")
Lifetime
Sessions are in-memory with a 30-minute idle TTL and 128-entry capacity (LRU). They do not survive a server restart unless exported and re-imported.
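To make the eviction rules concrete, here is a toy model of an idle-TTL plus LRU table. It illustrates the policy only and is not the server's implementation: the class name `IdleLruTable` is hypothetical, and it assumes that any read or write of a session resets its idle timer.

```python
import time
from collections import OrderedDict


class IdleLruTable:
    """Toy session table: entries expire after idle_ttl seconds without
    access, and the least recently used entry is evicted past capacity."""

    def __init__(self, capacity=128, idle_ttl=30 * 60, clock=time.monotonic):
        self.capacity = capacity
        self.idle_ttl = idle_ttl
        self.clock = clock
        self._entries = OrderedDict()  # session_id -> (last_used, value)

    def _purge_expired(self):
        now = self.clock()
        expired = [sid for sid, (last_used, _) in self._entries.items()
                   if now - last_used >= self.idle_ttl]
        for sid in expired:
            del self._entries[sid]

    def get(self, session_id):
        self._purge_expired()
        if session_id not in self._entries:
            return None
        _, value = self._entries[session_id]
        # Assumption: a read counts as activity and refreshes the idle timer.
        self._entries[session_id] = (self.clock(), value)
        self._entries.move_to_end(session_id)
        return value

    def put(self, session_id, value):
        self._purge_expired()
        self._entries[session_id] = (self.clock(), value)
        self._entries.move_to_end(session_id)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the least recently used
```

Under this model, a session can disappear for two reasons: it sat idle past the TTL, or enough other sessions were touched more recently to push it out of the table. Either way, a client that needs the history later would have to export it first.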
Code execution subprocess
If the session has an active Python subprocess (code execution), the subprocess is not part of the exportable state.
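Because sessions are lost on restart unless exported, one option is to snapshot them to disk before shutting down and re-import them afterwards. The sketch below assumes a Runner-like object whose `list_session_ids` returns an iterable of string ids and whose `export_session` returns a JSON string or None; `save_sessions`, `restore_sessions`, and the one-file-per-session layout are hypothetical choices for illustration. A snapshot taken this way holds only the exportable state, so any code-execution subprocess is not restored.

```python
from pathlib import Path
from urllib.parse import quote, unquote


def save_sessions(runner, directory):
    """Write each exportable session to <directory>/<percent-encoded id>.json."""
    out = Path(directory)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for session_id in runner.list_session_ids():
        exported = runner.export_session(session_id)
        if exported is None:
            # The session may have been evicted between listing and export.
            continue
        # Percent-encode the id so characters like "/" are safe in a filename.
        path = out / f"{quote(session_id, safe='')}.json"
        path.write_text(exported, encoding="utf-8")
        saved.append(session_id)
    return saved


def restore_sessions(runner, directory):
    """Re-import every snapshot found in directory; returns the restored ids."""
    restored = []
    for path in sorted(Path(directory).glob("*.json")):
        session_id = unquote(path.stem)
        runner.import_session(session_id, path.read_text(encoding="utf-8"))
        restored.append(session_id)
    return restored
```

A typical use would be calling `save_sessions(runner, "session-backup")` before stopping the process and `restore_sessions(runner, "session-backup")` on the fresh Runner after startup; the directory name here is only an example.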