mistralrs
An object wrapping the underlying Rust system to handle requests and process conversations.
Send an OpenAI API compatible chat completion request, returning the result.
Send an OpenAI API compatible completion request, returning the result.
Generate an image.
Send a request to re-ISQ the model. If the model was loaded as GGUF or GGML, this is a no-op.
Tokenize some text, returning raw tokens.
Detokenize some tokens, returning text.
A multi-model runner that provides a cleaner interface for managing multiple models. This wraps the existing Runner and provides model-specific methods.
Send a chat completion request to a specific model.
Send a completion request to a specific model.
Send a chat completion request to the default model.
Send a completion request to the default model.
Generate an image using the default model.
Tokenize some text using the default model.
An OpenAI API compatible chat completion request.
An OpenAI API compatible completion request.
Chat completion response message.
Delta in content for streaming response.
A logprob with the top logprobs for this token.
Logprobs per token.
Chat completion choice.
Chat completion streaming chunk choice.
OpenAI compatible (superset) usage statistics for a request.
An OpenAI compatible chat completion response.
Chat completion streaming request chunk.
Completion request choice.
An OpenAI compatible completion response.
Top-n logprobs element.
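Since the request and response types above mirror the OpenAI schema, their wire-level shape can be sketched with plain dictionaries. Field names follow the OpenAI chat completions format; the values are illustrative:

```python
# An OpenAI-compatible chat completion request body.
request = {
    "model": "mistral",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "max_tokens": 64,
    "temperature": 0.7,
    "logprobs": True,
    "top_logprobs": 3,  # top-n logprobs per token
}

# The matching response carries choices plus usage counters.
response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hi there!"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 18, "completion_tokens": 3, "total_tokens": 21},
}

# Usage is self-consistent: total = prompt + completion.
assert response["usage"]["total_tokens"] == (
    response["usage"]["prompt_tokens"]
    + response["usage"]["completion_tokens"]
)
```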
DType for the model.
If the model is quantized, this is ignored, so it is reasonable to use the Default impl.
Note: when using Auto, the fallback pattern is BF16 -> F16 -> F32.
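The Auto fallback above amounts to "use the best dtype the device supports". A minimal sketch of that selection logic, with a hypothetical device-capability set standing in for the real probe:

```python
def resolve_auto_dtype(supported: set) -> str:
    """Pick the first dtype in the BF16 -> F16 -> F32 fallback chain
    that the device supports; F32 is assumed always available."""
    for dtype in ("BF16", "F16"):
        if dtype in supported:
            return dtype
    return "F32"


assert resolve_auto_dtype({"BF16", "F16"}) == "BF16"  # modern accelerator
assert resolve_auto_dtype({"F16"}) == "F16"           # older GPU
assert resolve_auto_dtype(set()) == "F32"             # final fallback
```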
Image generation response format.
MCP server source configuration for different transport types.
Configuration for an individual MCP server.
Configuration for MCP client integration.
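A sketch of what an MCP client configuration might look like, with one server per transport style (a spawned process and an HTTP endpoint). The field names and structure here are assumptions for illustration only, not the exact schema; consult the configuration types above for the real fields.

```json
{
  "servers": [
    {
      "name": "filesystem",
      "source": {
        "type": "Process",
        "command": "npx",
        "args": ["@modelcontextprotocol/server-filesystem", "."]
      }
    },
    {
      "name": "web-search",
      "source": { "type": "Http", "url": "https://example.com/mcp" }
    }
  ],
  "auto_register_tools": true
}
```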