List of all items
Structs
- AnyMoeConfig
- AnyMoeLoader
- AnyMoePipeline
- CalledFunction
- ChatCompletionChunkResponse
- ChatCompletionResponse
- ChatTemplate
- Choice
- ChunkChoice
- CompletionChoice
- CompletionChunkChoice
- CompletionChunkResponse
- CompletionResponse
- Delta
- DetokenizationRequest
- DeviceLayerMapMetadata
- DeviceMapMetadata
- DiffusionGenerationParams
- DiffusionLoader
- DiffusionLoaderBuilder
- DiffusionSpecificConfig
- DrySamplingParams
- Function
- GGMLLoader
- GGMLLoaderBuilder
- GGMLSpecificConfig
- GGUFLoader
- GGUFLoaderBuilder
- GGUFSpecificConfig
- GemmaLoader
- Idefics2Loader
- ImageChoice
- ImageGenerationResponse
- LLaVALoader
- LLaVANextLoader
- LayerDeviceMapper
- LayerTopology
- LlamaLoader
- LoaderBuilder
- LocalModelPaths
- Logprobs
- MemoryUsage
- MistralLoader
- MistralRs
- MistralRsBuilder
- MistralRsConfig
- MixtralLoader
- NormalLoader
- NormalLoaderBuilder
- NormalRequest
- NormalSpecificConfig
- Ordering
- PagedAttentionConfig
- Phi2Loader
- Phi3Loader
- Phi3VLoader
- Qwen2Loader
- ResponseLogprob
- ResponseMessage
- SamplingParams
- SpeculativeConfig
- SpeculativeLoader
- SpeculativePipeline
- Starcoder2Loader
- TokenizationRequest
- Tool
- ToolCallResponse
- TopLogprob
- Topology
- Usage
- VisionLoader
- VisionLoaderBuilder
- VisionSpecificConfig
- layers::CausalMasker
- layers::Conv3dConfig
- layers::Conv3dNoBias
- layers::DeepSeekV2RopeConfig
- layers::DeepSeekV2RotaryEmbedding
- layers::F32RmsNorm
- layers::Llama3RopeConfig
- layers::MatMul
- layers::PhiRopeConfig
- layers::PhiRotaryEmbedding
- layers::QLinear
- layers::QRmsNorm
- layers::Qwen2VLRotaryEmbedding
- layers::RmsNorm
- layers::RotaryEmbedding
- layers::Sdpa
Enums
- AnyMoeExpertType
- Constraint
- DefaultSchedulerMethod
- DiffusionLoaderType
- EngineInstruction
- GGUFArchitecture
- ImageGenerationResponseFormat
- IsqOrganization
- IsqType
- MemoryGpuConfig
- MistralRsError
- ModelCategory
- ModelDType
- ModelKind
- ModelSelected
- NormalLoaderType
- Request
- RequestMessage
- Response
- ResponseErr
- ResponseOk
- SchedulerConfig
- StopTokens
- TokenSource
- ToolCallType
- ToolChoice
- ToolType
- VisionLoaderType
- layers::Activation
- layers::DeepSeekV2RopeScaling
- layers::Llama3RopeType
- layers::Llama3RotaryEmbedding
- layers::PhiRopeScalingConfig
- layers::ScaledRopeType
Traits
Functions
- get_model_dtype
- get_tgt_non_granular_index
- get_toml_selected_model_dtype
- initialize_logging
- layers::get_use_matmul_via_f16
- layers::repeat_kv
- paged_attn_supported
- parse_isq_value