Use the OpenAI Responses API
mistral.rs implements the OpenAI Responses API at /v1/responses alongside Chat Completions. Responses is OpenAI’s API shape for agentic workloads: tool calls, background processing, and cancellation.
Both endpoints run on the same server.
Endpoints
- POST /v1/responses: create a new response. Returns a response object with a unique id.
- GET /v1/responses/{id}: fetch the current state, including any streamed deltas.
- DELETE /v1/responses/{id}: delete a response.
- POST /v1/responses/{id}/cancel: cancel a background response that has not finished.
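The four routes above can be wrapped in a small helper. This is a minimal sketch using only the Python standard library; `build_request`, `send`, and `BASE_URL` are hypothetical names chosen for illustration, and the host and port are assumed to match the curl examples on this page.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234"  # assumption: same host/port as the curl examples

# Method and path for each Responses route listed above.
ROUTES = {
    "create": ("POST", "/v1/responses"),
    "get": ("GET", "/v1/responses/{id}"),
    "delete": ("DELETE", "/v1/responses/{id}"),
    "cancel": ("POST", "/v1/responses/{id}/cancel"),
}


def build_request(action, response_id=None, body=None, base_url=BASE_URL):
    """Return an unsent urllib Request for one of the four routes."""
    method, path = ROUTES[action]
    if "{id}" in path:
        if not response_id:
            raise ValueError(f"{action!r} requires a response id")
        path = path.replace("{id}", response_id)
    # Only attach a JSON body (and its header) when one is supplied.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(base_url + path, data=data, headers=headers, method=method)


def send(request):
    """Send the request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

For example, `send(build_request("cancel", response_id="resp_abc123"))` would issue the cancel call, assuming a server is running at `BASE_URL`.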
Choosing an endpoint
Responses supports polling, mid-flight cancellation via /cancel, and background processing. Chat Completions returns the full response on a single connection.
Function tools use the same OpenAI-compatible definitions as Chat Completions, including strict: true for JSON-Schema-constrained tool arguments. See strict tool calling.
Supported fields
A few fields are accepted for compatibility but reject non-default values:
- parallel_tool_calls must be true (default) or omitted. false returns an error.
- max_tool_calls is unsupported; any value returns an error. To cap tool rounds, use the server-level --max-tool-rounds flag (applies to both Chat Completions and Responses).
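A client can catch these two cases before sending a request. The sketch below is a hypothetical client-side check, not part of mistral.rs; `check_compat_fields` is an illustrative name, and treating an explicit null for `max_tool_calls` as "omitted" is an assumption about server behavior.

```python
def check_compat_fields(request_body):
    """Return a list of problems the server is documented to reject."""
    problems = []
    # parallel_tool_calls: true or omitted is fine, false is an error.
    if request_body.get("parallel_tool_calls") is False:
        problems.append("parallel_tool_calls must be true or omitted")
    # max_tool_calls: any value is an error (null treated as omitted here,
    # which is an assumption).
    if request_body.get("max_tool_calls") is not None:
        problems.append(
            "max_tool_calls is unsupported; use --max-tool-rounds on the server"
        )
    return problems
```

Returning a list rather than raising on the first problem lets a caller report both issues at once.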
mistral.rs extensions
Non-OpenAI fields accepted in Responses requests (also accepted on Chat Completions):
- stop: custom stop sequences.
- repetition_penalty, top_k, min_p: sampling options not in OpenAI’s API.
- dry_multiplier, dry_base, dry_allowed_length, dry_sequence_breakers: DRY sampling.
- grammar: constrained generation via llguidance.
- web_search_options: per-request search behavior (matches OpenAI’s syntax).
Full field reference: HTTP API reference.
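As an illustration, a request body that sets several of the extension fields might be assembled as follows. This is a sketch only: `build_payload` is a hypothetical helper, every numeric value is an arbitrary placeholder rather than a recommended setting, and the value types (for example, lists of strings for `stop` and `dry_sequence_breakers`) are assumptions to check against the HTTP API reference. `grammar` and `web_search_options` are left out because their shapes are not described here.

```python
import json


def build_payload(prompt, model="default"):
    """Build a Responses request body using mistral.rs sampling extensions."""
    return {
        "model": model,
        "input": prompt,
        # Extension fields; all values below are illustrative placeholders.
        "stop": ["\n\n"],
        "top_k": 40,
        "min_p": 0.05,
        "repetition_penalty": 1.1,
        "dry_multiplier": 0.8,
        "dry_base": 1.75,
        "dry_allowed_length": 2,
        "dry_sequence_breakers": ["\n", ":", "\"", "*"],
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the POST /v1/responses body.
    print(json.dumps(build_payload("Summarize today in tech news."), indent=2))
```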
Example
Create a response:

    curl http://localhost:1234/v1/responses \
      -H "Content-Type: application/json" \
      -d '{
        "model": "default",
        "input": "Summarize today in tech news.",
        "background": true
      }'

Poll progress:

    curl http://localhost:1234/v1/responses/resp_abc123

Cancel:

    curl -X POST http://localhost:1234/v1/responses/resp_abc123/cancel

Request and response schemas match OpenAI’s spec, with the additions listed above.
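The polling step can also be scripted. The following is a minimal sketch, assuming the response object carries a `status` field with OpenAI's status names (`queued`, `in_progress`, `completed`, `failed`, `cancelled`, `incomplete`); `poll_until_done` and `http_get_response` are hypothetical helper names, and the fetch function is injectable so the loop can be exercised without a server.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:1234"  # assumption: same server as the curl examples
# Assumption: terminal status names follow OpenAI's Responses spec.
TERMINAL = {"completed", "failed", "cancelled", "incomplete"}


def http_get_response(response_id):
    """Fetch the current response object via GET /v1/responses/{id}."""
    with urllib.request.urlopen(f"{BASE_URL}/v1/responses/{response_id}") as reply:
        return json.load(reply)


def poll_until_done(response_id, fetch=http_get_response, interval=1.0, max_polls=60):
    """Poll until the response reaches a terminal status or max_polls is hit."""
    for _ in range(max_polls):
        response = fetch(response_id)
        if response.get("status") in TERMINAL:
            return response
        time.sleep(interval)
    raise TimeoutError(f"{response_id} still running after {max_polls} polls")
```

If the loop times out, the caller can decide whether to keep waiting or to stop the work with POST /v1/responses/{id}/cancel.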