Struct LlgTokenizerInit
#[repr(C)]
pub struct LlgTokenizerInit {
    pub vocab_size: u32,
    pub tok_eos: u32,
    pub token_lens: *const u32,
    pub token_bytes: *const u8,
    pub tokenizer_json: *const i8,
    pub tokenize_assumes_string: bool,
    pub tokenize_fn: Option<extern "C" fn(*const c_void, *const u8, usize, *mut u32, usize) -> usize>,
    pub use_approximate_greedy_tokenize_fn: bool,
    pub tokenize_user_data: *const c_void,
    pub slices: *const *const i8,
}
Fields
vocab_size: u32
The number of tokens in the vocabulary.
tok_eos: u32
The token ID for the end-of-sentence token. For chat mode, set it to the end-of-turn token.
token_lens: *const u32
An array of the lengths of the token strings (vocab_size elements).
token_bytes: *const u8
A pointer to the token strings. The length of this buffer is the sum of all token_lens.
tokenizer_json: *const i8
Instead of passing token_lens and token_bytes, this can be set to the contents of a Hugging Face tokenizer.json file.
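The token_lens / token_bytes layout described above can be sketched as follows. This is a minimal illustration using only the standard library; flatten_vocab and the four-token vocabulary are hypothetical and not part of this crate.

```rust
/// Hypothetical helper: flattens a vocabulary into the two parallel buffers
/// described above. `token_lens[i]` is the byte length of token `i`, and
/// `token_bytes` is all token strings concatenated in token-ID order.
fn flatten_vocab(tokens: &[&[u8]]) -> (Vec<u32>, Vec<u8>) {
    let mut token_lens = Vec::with_capacity(tokens.len());
    let mut token_bytes = Vec::new();
    for tok in tokens {
        token_lens.push(tok.len() as u32);
        token_bytes.extend_from_slice(tok);
    }
    (token_lens, token_bytes)
}

fn main() {
    // Toy vocabulary, for illustration only.
    let vocab: [&[u8]; 4] = [b"a", b"b", b"ab", b"</s>"];
    let (token_lens, token_bytes) = flatten_vocab(&vocab);

    // The length of token_bytes is the sum of all token_lens.
    let total: usize = token_lens.iter().map(|&n| n as usize).sum();
    assert_eq!(total, token_bytes.len());

    // Values that would go into the corresponding struct fields.
    let vocab_size = token_lens.len() as u32;
    let lens_ptr: *const u32 = token_lens.as_ptr();
    let bytes_ptr: *const u8 = token_bytes.as_ptr();
    println!("vocab_size={vocab_size} lens={lens_ptr:?} bytes={bytes_ptr:?}");
}
```

Both vectors must stay alive for as long as the raw pointers taken from them are in use.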
tokenize_assumes_string: bool
Set to true to enable hack that works around the tokenize_fn only accepting valid UTF-8 strings and possibly adding
tokenize_fn: Option<extern "C" fn(*const c_void, *const u8, usize, *mut u32, usize) -> usize>
Tokenization function, see LlgTokenizeFn docs. It should only tokenize the bytes and not add any
use_approximate_greedy_tokenize_fn: bool
Set to true to skip tokenize_fn and tokenize greedily instead, which is often incorrect and may reduce accuracy.
tokenize_user_data: *const c_void
User data to pass to the tokenize_fn.
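As a rough sketch of how tokenize_fn and tokenize_user_data fit together, the callback below matches the function-pointer type of the field. The parameter names, the ByteTokenizer state, and the convention of returning the total token count while writing at most output_tokens_len entries are assumptions made for illustration; the LlgTokenizeFn docs are the authority on the actual contract.

```rust
use std::ffi::c_void;

/// Hypothetical tokenizer state, reached through `tokenize_user_data`.
struct ByteTokenizer {
    /// Token ID assigned to byte 0; byte `b` maps to `offset + b`.
    offset: u32,
}

/// Toy byte-level tokenizer with the same signature as the `tokenize_fn` field.
/// Assumption: writes at most `output_tokens_len` tokens and returns the total
/// number of tokens the input produces.
extern "C" fn byte_tokenize(
    user_data: *const c_void,
    bytes: *const u8,
    bytes_len: usize,
    output_tokens: *mut u32,
    output_tokens_len: usize,
) -> usize {
    // SAFETY: the caller passes the pointer stored in `tokenize_user_data`,
    // which in this sketch always points to a live ByteTokenizer.
    let state = unsafe { &*(user_data as *const ByteTokenizer) };
    let input: &[u8] = if bytes_len == 0 {
        &[]
    } else {
        // SAFETY: the caller guarantees `bytes` points to `bytes_len` readable bytes.
        unsafe { std::slice::from_raw_parts(bytes, bytes_len) }
    };
    let n = input.len().min(output_tokens_len);
    if n > 0 {
        // SAFETY: the caller guarantees room for `output_tokens_len` tokens.
        let out = unsafe { std::slice::from_raw_parts_mut(output_tokens, n) };
        for (dst, &b) in out.iter_mut().zip(input) {
            *dst = state.offset + b as u32;
        }
    }
    input.len()
}

fn main() {
    let state = ByteTokenizer { offset: 3 };
    // Values that would go into the corresponding struct fields.
    let tokenize_fn: Option<
        extern "C" fn(*const c_void, *const u8, usize, *mut u32, usize) -> usize,
    > = Some(byte_tokenize);
    let tokenize_user_data = &state as *const ByteTokenizer as *const c_void;

    let text = b"hi";
    let mut out = [0u32; 8];
    let f = tokenize_fn.unwrap();
    let n = f(tokenize_user_data, text.as_ptr(), text.len(), out.as_mut_ptr(), out.len());
    assert_eq!(n, 2);
    assert_eq!(out[..n].to_vec(), vec![3 + b'h' as u32, 3 + b'i' as u32]);
}
```

The callback only reads its state through a shared reference, which keeps it free of hidden mutation; the state must outlive every call made through the pointer.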
slices: *const *const i8
Tokenizer partitions for the slicer optimization. This is an array of pointers to strings, terminated with NULL (argv style). Pass NULL to use the defaults. Pass an empty array to disable the optimization.
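The argv-style array expected by slices could be built along these lines. SliceList and count_entries are hypothetical helpers, and the partition strings are placeholders, since this page does not describe what a partition string looks like. The sketch uses c_char, which is i8 on the platform this page was generated for.

```rust
use std::ffi::{c_char, CString};
use std::ptr;

/// Hypothetical owner of a NULL-terminated (argv-style) array of C strings.
struct SliceList {
    /// Keeps the string storage alive while `ptrs` refers to it.
    _owned: Vec<CString>,
    /// One pointer per string, followed by a terminating NULL.
    ptrs: Vec<*const c_char>,
}

impl SliceList {
    fn new(parts: &[&str]) -> Self {
        let owned: Vec<CString> = parts
            .iter()
            .map(|p| CString::new(*p).expect("partition contains an interior NUL"))
            .collect();
        let mut ptrs: Vec<*const c_char> = owned.iter().map(|s| s.as_ptr()).collect();
        ptrs.push(ptr::null()); // argv-style terminator
        SliceList { _owned: owned, ptrs }
    }

    /// Pointer suitable for the `slices` field.
    fn as_ptr(&self) -> *const *const c_char {
        self.ptrs.as_ptr()
    }
}

/// Walks the array until the NULL terminator, as a consumer would.
fn count_entries(list: &SliceList) -> usize {
    let mut p = list.as_ptr();
    let mut n = 0;
    // SAFETY: `ptrs` always ends with a NULL entry, so the walk stays in bounds.
    unsafe {
        while !(*p).is_null() {
            n += 1;
            p = p.add(1);
        }
    }
    n
}

fn main() {
    // Placeholder strings; real partitions are not described on this page.
    let custom = SliceList::new(&["placeholder-a", "placeholder-b"]);
    assert_eq!(count_entries(&custom), 2);

    // An empty array (just the terminator) disables the optimization.
    let disabled = SliceList::new(&[]);
    assert_eq!(count_entries(&disabled), 0);

    // A NULL pointer in the field itself requests the defaults.
    let use_defaults: *const *const c_char = ptr::null();
    assert!(use_defaults.is_null());
}
```

Note the difference between a NULL field (use defaults) and a non-NULL pointer to an array whose first entry is NULL (disable).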
Auto Trait Implementations
impl Freeze for LlgTokenizerInit
impl RefUnwindSafe for LlgTokenizerInit
impl !Send for LlgTokenizerInit
impl !Sync for LlgTokenizerInit
impl Unpin for LlgTokenizerInit
impl UnwindSafe for LlgTokenizerInit
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
    fn borrow_mut(&mut self) -> &mut T
impl<T> Downcast for T
where
    T: AsAny + ?Sized,
    fn downcast_ref<T>(&self) -> Option<&T>
    where
        T: AsAny,
    fn downcast_mut<T>(&mut self) -> Option<&mut T>
    where
        T: AsAny,
impl<T> Instrument for T
    fn instrument(self, span: Span) -> Instrumented<Self>
    fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
    fn into_either(self, into_left: bool) -> Either<Self, Self>
    Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
    fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
    Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<F, T> IntoSample<T> for F
where
    T: FromSample<F>,
    fn into_sample(self) -> T
impl<T> Pointable for T
impl<T> PolicyExt for T
where
    T: ?Sized,
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
    fn to_subset(&self) -> Option<SS>
    The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
    fn is_in_subset(&self) -> bool
    Checks if self is actually part of its subset T (and can be converted to it).
    fn to_subset_unchecked(&self) -> SS
    Use with care! Same as self.to_subset but without any property checks. Always succeeds.
    fn from_subset(element: &SS) -> SP
    The inclusion map: converts self to the equivalent element of its superset.