An OpenAI-compatible chat completion response.
An OpenAI-compatible completion response.
A builder for a GGML loader.
A config for a GGML loader.
A builder for a GGUF loader.
A config for a GGUF loader.
The MistralRs struct handles sending requests to the engine.
It is the core multi-threaded component of mistral.rs, and uses mpsc
Sender and Receiver primitives to exchange requests with the engine.
The MistralRsBuilder takes the pipeline and a scheduler method and constructs
an Engine and a MistralRs instance. The Engine runs on a separate thread, and the MistralRs
instance stays on the calling thread.
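The handle-plus-engine-thread arrangement described above can be illustrated with a minimal sketch built only on the standard library's `std::sync::mpsc` channel. This is not the mistral.rs API: `Request`, `Handle`, `spawn`, and `ask` are hypothetical names chosen for illustration, and the "engine" here merely echoes the prompt back.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical request type: a prompt plus a channel for the reply.
struct Request {
    prompt: String,
    reply: Sender<String>,
}

/// Hypothetical handle kept on the calling thread; it owns only the sending half.
struct Handle {
    tx: Sender<Request>,
}

impl Handle {
    /// Spawn a stand-in engine loop on its own thread and return the handle.
    fn spawn() -> Handle {
        let (tx, rx) = channel::<Request>();
        thread::spawn(move || {
            // Engine loop: runs until every Sender has been dropped.
            for req in rx {
                let answer = format!("echo: {}", req.prompt);
                // Ignore the error if the caller has stopped waiting.
                let _ = req.reply.send(answer);
            }
        });
        Handle { tx }
    }

    /// Send one request and block until the engine thread replies.
    /// Returns None if the engine thread is no longer running.
    fn ask(&self, prompt: &str) -> Option<String> {
        let (reply_tx, reply_rx) = channel();
        self.tx
            .send(Request { prompt: prompt.to_string(), reply: reply_tx })
            .ok()?;
        reply_rx.recv().ok()
    }
}
```

Calling `Handle::spawn()` and then `ask("hello")` on the result sends the request across the channel and waits for the reply. Bundling a per-request reply `Sender` inside each message is one common way to route responses back to the right caller; whether mistral.rs does exactly this is an assumption here.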
A loader for a “normal” (non-quantized) model.
A builder for a loader for a “normal” (non-quantized) model.
Config specific to loading a normal model.
Sampling params control how tokens are sampled during generation.
OpenAI-compatible (superset) usage during a request.