Sessions

Agentic requests are stateful. Pass a session_id to keep state across requests; the response echoes it back. See Lifetime for eviction and TTL limits.

curl http://localhost:1234/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "default",
"messages": [{"role": "user", "content": "What is 2 to the 10th?"}],
"session_id": "user-42-chat-abc"
}'

The response body includes a top-level session_id field.
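The same request can be sketched in Python using only the standard library. The helper names `build_chat_request` and `chat` are illustrative and not part of any client library; the URL, the `model` value, and the `session_id` field are taken from the curl example above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234"  # host and port from the curl example


def build_chat_request(messages, session_id=None, model="default"):
    """Build a POST request for /v1/chat/completions (hypothetical helper)."""
    payload = {"model": model, "messages": messages}
    if session_id is not None:
        # Pin the request to a session; omit the field to let the server decide.
        payload["session_id"] = session_id
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(messages, session_id=None):
    """Send one turn and return (response body, session_id echoed by the server)."""
    with urllib.request.urlopen(build_chat_request(messages, session_id)) as resp:
        body = json.load(resp)
    # The response carries session_id as a top-level field.
    return body, body["session_id"]
```

A client would typically store the returned id after the first turn and pass it on every later turn, for example `chat(messages, session_id="user-42-chat-abc")`.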

  • Explicit session_id on the request: direct lookup. An existing session continues; an unknown id starts a new session under that id.

  • No session_id: the server falls back to content matching. It matches the incoming messages against the user-visible message prefix of a stored session, subject to two conditions:

    • tool-role entries in stored history are skipped;
    • at least two incoming messages are required.

    On a match, that session and its id are reused; otherwise a fresh id is generated. Either way the id is returned in the response.

Content matching exists for clients that cannot pass session_id. When two clients send identical opening messages, it can route them to the same session; pass an explicit session_id in correctness-sensitive deployments.
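The resolution rules above, and the collision risk they carry, can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not the server's implementation: the function names, the dict-based store, the generated id format, and in particular the exact definition of a "match" (stored non-tool history is a prefix of the incoming messages) are all assumptions made for illustration.

```python
import uuid


def visible(history):
    """Drop tool-role entries, which content matching skips in stored history."""
    return [(m["role"], m["content"]) for m in history if m["role"] != "tool"]


def resolve_session(sessions, incoming, session_id=None):
    """Return the id of the session a request would use, creating it if needed.

    `sessions` maps session id -> stored message history (list of dicts).
    """
    if session_id is not None:
        # Explicit id: direct lookup; an unknown id starts a new session under it.
        sessions.setdefault(session_id, [])
        return session_id
    if len(incoming) >= 2:  # content matching needs at least two incoming messages
        probe = [(m["role"], m["content"]) for m in incoming]
        for sid, history in sessions.items():
            stored = visible(history)
            # ASSUMPTION: a match means the stored user-visible history is a
            # non-empty prefix of the incoming messages.
            if stored and probe[: len(stored)] == stored:
                return sid
    new_id = uuid.uuid4().hex  # hypothetical id format
    sessions[new_id] = []
    return new_id


# Two different clients that happen to share the same opening exchange.
sessions = {
    "alice": [
        {"role": "user", "content": "Hi"},
        {"role": "tool", "content": "(tool output)"},
        {"role": "assistant", "content": "Hello!"},
    ]
}
from_bob = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "What is the weather?"},
]
collided = resolve_session(sessions, from_bob)                   # lands in "alice"
isolated = resolve_session(sessions, from_bob, session_id="bob")  # own session
```

In this model the request without an id is routed into the other client's session, while the explicit id keeps the two apart, which is why an explicit `session_id` is the safer choice when correctness matters.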

A session holds:

  • the full message history, including tool-call records and synthesized tool messages;
  • multimodal payloads from earlier turns;
  • a handle to the Python code-execution subprocess, if any.

When a request resolves to an existing session, the stored history is spliced into the request. A client can therefore edit a previous turn without losing the tool-call history. The algorithm is documented in session memory internals.

# Export (404 if the session does not exist)
curl http://localhost:1234/v1/sessions/user-42-chat-abc
# Import or replace: body is a SerializedSession from a previous GET
curl -X PUT http://localhost:1234/v1/sessions/user-42-chat-abc \
-H "Content-Type: application/json" \
-d @saved-session.json
# Delete (idempotent: 200 whether or not the session existed)
curl -X DELETE http://localhost:1234/v1/sessions/user-42-chat-abc

PUT replaces any existing session with the same id and returns 400 for an invalid payload.
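For clients that script backups, the three endpoints can be wrapped in a few standard-library helpers. This is a minimal sketch: the function names are hypothetical, and the only behaviour relied on is what the curl examples state (404 on a missing export, 400 on an invalid PUT payload, 200 on delete).

```python
import json
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:1234"  # host and port from the curl examples


def session_request(method, session_id, body=None):
    """Build a request for /v1/sessions/<id> (hypothetical helper)."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Content-Type": "application/json"} if data is not None else {}
    # Percent-encode the id so unusual characters cannot alter the path.
    url = f"{BASE_URL}/v1/sessions/{urllib.parse.quote(session_id, safe='')}"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def export_session(session_id):
    """Return the SerializedSession as a dict, or None if the server answers 404."""
    try:
        with urllib.request.urlopen(session_request("GET", session_id)) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def import_session(session_id, serialized):
    """PUT a previously exported session; an invalid payload raises HTTPError (400)."""
    with urllib.request.urlopen(session_request("PUT", session_id, serialized)) as resp:
        return resp.status


def delete_session(session_id):
    """DELETE a session; returns the HTTP status code."""
    with urllib.request.urlopen(session_request("DELETE", session_id)) as resp:
        return resp.status
```

A backup-and-restore round trip is then `saved = export_session(sid)` on one server followed by `import_session(sid, saved)` on another, after checking that `saved` is not `None`.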

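Session lifetime is bounded in three ways:

  • Idle expiry: a session is evicted after 30 minutes of inactivity.
  • Capacity: the server keeps at most 128 sessions; once the cap is reached, the least recently used session is evicted.
  • Server restart: all sessions are lost unless they were exported beforehand and re-imported afterwards.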
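How idle expiry and an LRU cap interact can be shown with a small in-memory model. The class below is an illustration only and makes no claim about the server's internals: the `SessionStore` name, the injectable clock, and the choice to expire a session at exactly the TTL boundary are assumptions. Only the 30-minute and 128-session figures come from the limits above.

```python
import time
from collections import OrderedDict

IDLE_TTL_SECONDS = 30 * 60  # 30 minutes of inactivity
MAX_SESSIONS = 128          # capacity before LRU eviction


class SessionStore:
    """Toy session table with idle expiry and least-recently-used eviction."""

    def __init__(self, clock=time.monotonic, ttl=IDLE_TTL_SECONDS, capacity=MAX_SESSIONS):
        self._clock = clock
        self._ttl = ttl
        self._capacity = capacity
        self._sessions = OrderedDict()  # id -> (last_used, state), oldest first

    def _expire(self):
        now = self._clock()
        # Entries are ordered by last use, so stop at the first live one.
        while self._sessions:
            sid, (last_used, _) = next(iter(self._sessions.items()))
            if now - last_used < self._ttl:
                break
            del self._sessions[sid]

    def touch(self, sid, state=None):
        """Create or refresh a session and return its state."""
        self._expire()
        if sid in self._sessions:
            _, old = self._sessions.pop(sid)
            state = old if state is None else state
        elif len(self._sessions) >= self._capacity:
            self._sessions.popitem(last=False)  # evict the least recently used
        self._sessions[sid] = (self._clock(), state)  # re-insert as most recent
        return state

    def __contains__(self, sid):
        self._expire()
        return sid in self._sessions
```

The practical consequence for clients is that any session may disappear between requests, so long-lived conversations are safer when exported periodically.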

Sessions with code execution hold a Python subprocess. The subprocess is not part of the exportable state: after import on another server, a fresh subprocess starts on the next code execution call. Variables and imports from the original interpreter are gone, but message and tool history carry over.