mistralrs
An object wrapping the underlying Rust system to handle requests and process conversations.
Send an OpenAI API compatible request, returning the result.
Send an OpenAI API compatible request, returning the result.
Send a request to re-ISQ the model. If the model was loaded as GGUF or GGML then nothing will happen.
An OpenAI API compatible chat completion request.
An OpenAI API compatible completion request.
Chat completion response message.
Delta in content for streaming response.
A logprob with the top logprobs for this token.
Logprobs per token.
Chat completion choice.
Chat completion streaming chunk choice.
OpenAI compatible (superset) usage during a request.
An OpenAI compatible chat completion response.
Chat completion streaming request chunk.
Completion request choice.
An OpenAI compatible completion response.
Top-n logprobs element.
DType for the model.
If the model is quantized, this is ignored, so it is reasonable to use the [Default] impl.
Note: When using Auto, the fallback pattern is: BF16 -> F16 -> F32
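The fallback order in the note can be illustrated with a minimal sketch. It assumes the last step of the pattern refers to F32. `ConcreteDType`, `resolve_auto`, and the `is_supported` predicate are hypothetical names used only for illustration; they are not part of the crate's API, and the crate's real device checks are not shown here.

```rust
/// Hypothetical stand-in for the concrete dtypes named in the fallback note.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ConcreteDType {
    BF16,
    F16,
    F32,
}

/// Return the first dtype, in the order BF16 -> F16 -> F32, that the
/// caller-supplied predicate reports as supported, or `None` if none is.
fn resolve_auto(is_supported: impl Fn(ConcreteDType) -> bool) -> Option<ConcreteDType> {
    [ConcreteDType::BF16, ConcreteDType::F16, ConcreteDType::F32]
        .iter()
        .copied()
        .find(|&dtype| is_supported(dtype))
}
```

For example, calling `resolve_auto(|d| d != ConcreteDType::BF16)` models a device without BF16 support and yields `Some(ConcreteDType::F16)`.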
Image generation response format.