Struct LlgTokenizerInit
#[repr(C)]
pub struct LlgTokenizerInit {
    pub vocab_size: u32,
    pub tok_eos: u32,
    pub token_lens: *const u32,
    pub token_bytes: *const u8,
    pub tokenizer_json: *const i8,
    pub tokenize_assumes_string: bool,
    pub tokenize_fn: Option<extern "C" fn(_: *const c_void, _: *const u8, _: usize, _: *mut u32, _: usize) -> usize>,
    pub use_approximate_greedy_tokenize_fn: bool,
    pub tokenize_user_data: *const c_void,
}
Fields§
vocab_size: u32
The number of tokens in the vocabulary.
tok_eos: u32
The token ID of the end-of-sentence token. For chat mode, set it to the end-of-turn token.
token_lens: *const u32
An array of the lengths of the token strings (vocab_size elements)
token_bytes: *const u8
A pointer to the token strings. The length of this buffer is the sum of all token_lens.
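Because the buffer length equals the sum of all token_lens, the token strings appear to be stored back to back with no separators. The following is a minimal sketch of building both buffers from an in-memory vocabulary under that assumption; `flatten_vocab` and `token_at` are hypothetical helpers, not part of this crate.

```rust
/// Flattens a vocabulary into two parallel buffers:
/// one length per token, and all token bytes concatenated in order.
/// Hypothetical helper for illustration only.
fn flatten_vocab(tokens: &[&[u8]]) -> (Vec<u32>, Vec<u8>) {
    let mut token_lens = Vec::with_capacity(tokens.len());
    let mut token_bytes = Vec::new();
    for tok in tokens {
        token_lens.push(tok.len() as u32);
        token_bytes.extend_from_slice(tok);
    }
    (token_lens, token_bytes)
}

/// Recovers token `idx` from the flat buffers by summing the preceding lengths.
fn token_at<'a>(token_lens: &[u32], token_bytes: &'a [u8], idx: usize) -> &'a [u8] {
    let start: usize = token_lens[..idx].iter().map(|&n| n as usize).sum();
    let len = token_lens[idx] as usize;
    &token_bytes[start..start + len]
}
```

With buffers built this way, `token_lens.as_ptr()` and `token_bytes.as_ptr()` would be the values for the two pointer fields, and `token_lens.len()` the value for vocab_size. Both vectors presumably need to stay alive for as long as the struct is in use, since the struct only holds raw pointers.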
tokenizer_json: *const i8
Instead of passing token_lens and token_bytes, this can be set to the contents of a Hugging Face (HF) tokenizer.json file.
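The field is a raw character pointer, so the JSON text has to be handed over as a C string. The sketch below assumes the library expects a NUL-terminated string (this page does not say so explicitly); `json_to_c_string` is a hypothetical helper.

```rust
use std::ffi::CString;

/// Converts JSON text into an owned, NUL-terminated C string.
/// Returns `None` if the text contains an interior NUL byte,
/// which a C string cannot represent.
/// Hypothetical helper; NUL termination is an assumption about the API.
fn json_to_c_string(json: &str) -> Option<CString> {
    CString::new(json).ok()
}
```

The pointer for tokenizer_json would then come from `CString::as_ptr`, and the `CString` must outlive every use of that pointer. On targets where `c_char` is not `i8`, a pointer cast may be needed.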
tokenize_assumes_string: bool
Set to true to enable a hack that works around tokenize_fn only accepting valid UTF-8 strings and possibly adding special tokens.
tokenize_fn: Option<extern "C" fn(_: *const c_void, _: *const u8, _: usize, _: *mut u32, _: usize) -> usize>
Tokenization function, see LlgTokenizeFn docs.
It should only tokenize the bytes and not add any special tokens.
use_approximate_greedy_tokenize_fn: bool
Set to true to not use tokenize_fn and instead tokenize greedily, which is often incorrect and may reduce accuracy.
tokenize_user_data: *const c_void
User data to pass to the tokenize_fn
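As an illustration of how tokenize_fn and tokenize_user_data fit together, here is a minimal sketch of a callback with the signature shown in the struct definition. The parameter meanings (user data, input bytes, input length, output buffer, output capacity, returning the total token count) are assumptions; the LlgTokenizeFn docs are the authority. `ByteTokenizer` and `byte_tokenize` are hypothetical names, and the byte-per-token scheme is only a stand-in for a real tokenizer.

```rust
use std::os::raw::c_void;

/// Hypothetical state passed through `tokenize_user_data`.
struct ByteTokenizer {
    /// Token ID assigned to byte 0; byte `b` maps to `first_id + b`.
    first_id: u32,
}

/// Callback with the `tokenize_fn` signature. Assumed contract: write at most
/// `output_tokens_len` tokens and return the total number of tokens produced.
extern "C" fn byte_tokenize(
    user_data: *const c_void,
    bytes: *const u8,
    bytes_len: usize,
    output_tokens: *mut u32,
    output_tokens_len: usize,
) -> usize {
    // SAFETY: assumes the caller passes back the pointer stored in
    // `tokenize_user_data` and valid buffers of the stated lengths.
    let state = unsafe { &*(user_data as *const ByteTokenizer) };
    let input: &[u8] = if bytes_len == 0 {
        &[]
    } else {
        unsafe { std::slice::from_raw_parts(bytes, bytes_len) }
    };
    // Never write past the end of the output buffer.
    let writable = bytes_len.min(output_tokens_len);
    for (i, &b) in input.iter().take(writable).enumerate() {
        unsafe { *output_tokens.add(i) = state.first_id + u32::from(b) };
    }
    // One token per input byte in this toy scheme.
    bytes_len
}
```

In this sketch, `Some(byte_tokenize)` would go into tokenize_fn and a pointer to a `ByteTokenizer` value, cast to `*const c_void`, into tokenize_user_data. The state is only read, never mutated, so the callback does not depend on being called from a single thread.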
Auto Trait Implementations§
impl Freeze for LlgTokenizerInit
impl RefUnwindSafe for LlgTokenizerInit
impl !Send for LlgTokenizerInit
impl !Sync for LlgTokenizerInit
impl Unpin for LlgTokenizerInit
impl UnwindSafe for LlgTokenizerInit
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> Downcast for T where T: AsAny + ?Sized
fn downcast_ref<T>(&self) -> Option<&T> where T: AsAny
fn downcast_mut<T>(&mut self) -> Option<&mut T> where T: AsAny
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.