List of all items
Structs
- AnyMoeConfig
- AnyMoeLoader
- AnyMoePipeline
- ApproximateUserLocation
- CalledFunction
- ChatCompletionChunkResponse
- ChatCompletionResponse
- ChatTemplate
- Choice
- ChunkChoice
- CompletionChoice
- CompletionChunkChoice
- CompletionChunkResponse
- CompletionResponse
- Delta
- DetokenizationRequest
- DeviceLayerMapMetadata
- DeviceMapMetadata
- DiffusionGenerationParams
- DiffusionLoader
- DiffusionLoaderBuilder
- DiffusionSpecificConfig
- DrySamplingParams
- Function
- GGMLLoader
- GGMLLoaderBuilder
- GGMLSpecificConfig
- GGUFLoader
- GGUFLoaderBuilder
- GGUFSpecificConfig
- GemmaLoader
- Idefics2Loader
- ImageChoice
- ImageGenerationResponse
- LLaVALoader
- LLaVANextLoader
- LayerDeviceMapper
- LayerTopology
- LlamaLoader
- LoaderBuilder
- LocalModelPaths
- Logprobs
- MemoryUsage
- MistralLoader
- MistralRs
- MistralRsBuilder
- MistralRsConfig
- MixtralLoader
- NormalLoader
- NormalLoaderBuilder
- NormalRequest
- NormalSpecificConfig
- Ordering
- PagedAttentionConfig
- Phi2Loader
- Phi3Loader
- Phi3VLoader
- Qwen2Loader
- ResponseLogprob
- ResponseMessage
- SamplingParams
- SpeculativeConfig
- SpeculativeLoader
- SpeculativePipeline
- Starcoder2Loader
- TokenizationRequest
- Tool
- ToolCallResponse
- TopLogprob
- Topology
- Usage
- VisionLoader
- VisionLoaderBuilder
- VisionSpecificConfig
- WebSearchOptions
- layers::AvgPool2d
- layers::CausalMasker
- layers::Conv3dConfig
- layers::Conv3dNoBias
- layers::DeepSeekV2RopeConfig
- layers::DeepSeekV2RotaryEmbedding
- layers::F32RmsNorm
- layers::FloatInfo
- layers::Gemma3RopeScalingConfig
- layers::Gemma3RotaryEmbedding
- layers::Llama3RopeConfig
- layers::Llama3RotaryEmbedding
- layers::MatMul
- layers::Mlp
- layers::Phi4MMRopeScalingConfig
- layers::Phi4MMRotaryEmbedding
- layers::PhiRopeConfig
- layers::PhiRotaryEmbedding
- layers::QLinear
- layers::QRmsNorm
- layers::Qwen2VLRotaryEmbedding
- layers::Qwen2_5VLRotaryEmbedding
- layers::ReflectionPad2d
- layers::RmsNorm
- layers::RotaryEmbedding
- layers::ScaledEmbedding
- layers::Sdpa
Enums
- AnyMoeExpertType
- AutoDeviceMapParams
- BertEmbeddingModel
- Constraint
- DefaultSchedulerMethod
- DeviceMapSetting
- DiffusionLoaderType
- EngineInstruction
- GGUFArchitecture
- ImageGenerationResponseFormat
- IsqOrganization
- IsqType
- MemoryGpuConfig
- MistralRsError
- ModelCategory
- ModelDType
- ModelKind
- ModelSelected
- NormalLoaderType
- Request
- RequestMessage
- Response
- ResponseErr
- ResponseOk
- SchedulerConfig
- StopTokens
- TokenSource
- ToolCallType
- ToolChoice
- ToolType
- VisionLoaderType
- WebSearchUserLocation
- layers::Activation
- layers::DeepSeekV2RopeScaling
- layers::Gemmma3ScaledRopeType
- layers::Llama3RopeType
- layers::Phi4MMScaledRopeType
- layers::PhiRopeScalingConfig
- layers::ScaledRopeType
Traits
- CustomLogitsProcessor
- Loader
- ModelPaths
- Pipeline
- TryIntoDType
- VisionPromptPrefixer
- layers::GetFloatInfo
- layers::TensorInfExtend
Functions
- distributed::is_daemon
- distributed::use_nccl
- get_auto_device_map_params
- get_model_dtype
- get_tgt_non_granular_index
- get_toml_selected_model_device_map_params
- get_toml_selected_model_dtype
- initialize_logging
- layers::clamp_for_f16
- layers::conv2d
- layers::conv2d_no_bias
- layers::embedding
- layers::group_norm
- layers::layer_norm
- layers::linear
- layers::linear_b
- layers::linear_no_bias
- layers::repeat_kv
- paged_attn_supported
- parse_isq_value
- using_flash_attn