Skip to content

Requests

A ChatCompletionRequest represents a request sent to the mistral.rs engine. It encodes information about input data, sampling, and how to return the response.

The messages type is as follows: (for normal chat completion, for chat completion with images, pretemplated prompt)

Agent permission fields:

  • agent_permission: AgentPermission.Auto, .Ask, or .Deny. Applies to server-executed agent actions such as code execution, web search, file tools, callbacks, and external tool dispatch.
  • agent_approval_callback: called when agent_permission=AgentPermission.Ask with an AgentToolApproval. Return True, False, or AgentToolApprovalDecision.

See agent permissions for the shared CLI, HTTP, Python, and Rust behavior.

FieldTypeDefault
messageslist[dict[str, str]] | list[dict[str, list[dict[str, str | dict[str, str]]]]] | strrequired
modelstrrequired
logprobsboolFalse
n_choicesint1
logit_biasdict[int, float] | NoneNone
top_logprobsint | NoneNone
max_tokensint | NoneNone
presence_penaltyfloat | NoneNone
frequency_penaltyfloat | NoneNone
repetition_penaltyfloat | NoneNone
stop_seqslist[str] | NoneNone
temperaturefloat | NoneNone
top_pfloat | NoneNone
top_kint | NoneNone
streamboolFalse
grammarstr | NoneNone
grammar_typestr | NoneNone
min_pfloat | NoneNone
tool_schemaslist[str] | NoneNone
tool_choiceToolChoice | NoneNone
dry_multiplierfloat | NoneNone
dry_basefloat | NoneNone
dry_allowed_lengthint | NoneNone
dry_sequence_breakerslist[str] | NoneNone
web_search_optionsWebSearchOptions | NoneNone
enable_thinkingbool | NoneNone
truncate_sequenceboolFalse
reasoning_effortstr | NoneNone
max_tool_roundsint | NoneNone
tool_dispatch_urlstr | NoneNone
enable_code_executionboolFalse
agent_permissionAgentPermission | NoneNone
agent_approval_callbackCallable[[AgentToolApproval], bool | AgentToolApprovalDecision] | NoneNone
code_execution_permissionCodeExecutionPermission | NoneNone
session_idstr | NoneNone
fileslist[RequestedFile] | NoneNone

A CompletionRequest represents a request sent to the mistral.rs engine. It encodes information about input data, sampling, and how to return the response.

FieldTypeDefault
promptstrrequired
modelstrrequired
best_ofint1
echo_promptboolFalse
presence_penaltyfloat | NoneNone
frequency_penaltyfloat | NoneNone
repetition_penaltyfloat | NoneNone
logit_biasdict[int, float] | NoneNone
max_tokensint | NoneNone
n_choicesint1
stop_seqslist[str] | NoneNone
temperaturefloat | NoneNone
top_pfloat | NoneNone
suffixstr | NoneNone
top_kint | NoneNone
grammarstr | NoneNone
grammar_typestr | NoneNone
min_pfloat | NoneNone
tool_schemaslist[str] | NoneNone
tool_choiceToolChoice | NoneNone
dry_multiplierfloat | NoneNone
dry_basefloat | NoneNone
dry_allowed_lengthint | NoneNone
dry_sequence_breakerslist[str] | NoneNone
truncate_sequenceboolFalse

An EmbeddingRequest represents a request to compute embeddings for the provided input text.

FieldTypeDefault
inputstr | list[str] | list[int] | list[list[int]]required
truncate_sequenceboolFalse

Generated from mistralrs-pyo3/mistralrs.pyi.