Struct LlgTokenizerInit

#[repr(C)]
pub struct LlgTokenizerInit {
    pub vocab_size: u32,
    pub tok_eos: u32,
    pub token_lens: *const u32,
    pub token_bytes: *const u8,
    pub tokenizer_json: *const i8,
    pub tokenize_assumes_string: bool,
    pub tokenize_fn: Option<extern "C" fn(_: *const c_void, _: *const u8, _: usize, _: *mut u32, _: usize) -> usize>,
    pub use_approximate_greedy_tokenize_fn: bool,
    pub tokenize_user_data: *const c_void,
    pub slices: *const *const i8,
}

Fields§

§vocab_size: u32

The number of tokens in the vocabulary

§tok_eos: u32

The token ID of the end-of-sentence token. For chat mode, set it to the end-of-turn token.

§token_lens: *const u32

An array of the lengths of the token strings (vocab_size elements)

§token_bytes: *const u8

A pointer to the token strings, concatenated in token ID order. The length of this buffer is the sum of all token_lens.
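The relationship between token_lens and token_bytes can be sketched as follows. This is a minimal illustration, assuming the token strings are laid out back to back in token ID order; the helper names `flatten_vocab` and `token_at` are hypothetical and not part of the library.

```rust
/// Flattens a vocabulary into two parallel buffers: one length per token,
/// and all token bytes concatenated in token ID order.
/// (Hypothetical helper, not part of the library.)
fn flatten_vocab(tokens: &[&[u8]]) -> (Vec<u32>, Vec<u8>) {
    let mut token_lens = Vec::with_capacity(tokens.len());
    let mut token_bytes = Vec::new();
    for tok in tokens {
        token_lens.push(tok.len() as u32);
        token_bytes.extend_from_slice(tok);
    }
    (token_lens, token_bytes)
}

/// Recovers token `idx` from the flattened buffers by summing the
/// lengths of all preceding tokens to find its start offset.
fn token_at<'a>(token_lens: &[u32], token_bytes: &'a [u8], idx: usize) -> &'a [u8] {
    let start: usize = token_lens[..idx].iter().map(|&n| n as usize).sum();
    &token_bytes[start..start + token_lens[idx] as usize]
}
```

With buffers built this way, `token_lens.as_ptr()` and `token_bytes.as_ptr()` would be the values for the two fields, and `token_lens.len()` would match vocab_size. Both vectors must outlive any use of the pointers.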

§tokenizer_json: *const i8

Instead of passing token_lens and token_bytes, this can be set to the contents of a Hugging Face (HF) tokenizer.json file.

§tokenize_assumes_string: bool

Set to true to enable a hack that works around tokenize_fn only accepting valid UTF-8 strings and possibly adding special tokens such as <BOS>. TODO: the <BOS> handling is not implemented yet.

§tokenize_fn: Option<extern "C" fn(_: *const c_void, _: *const u8, _: usize, _: *mut u32, _: usize) -> usize>

Tokenization function; see the LlgTokenizeFn docs. It should only tokenize the bytes and not add any special tokens such as <BOS>. It should also work on any byte sequence, including invalid UTF-8. If this is not the case, set tokenize_assumes_string to true. Either way, this function has to be thread-safe!
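A callback with the signature above might look like the sketch below. It is a toy byte-level tokenizer, not a real one. The parameter meanings (user data, input pointer, input length, output buffer, output capacity) and the convention of returning the total token count even when the output buffer is too small are assumptions here; the LlgTokenizeFn docs are the authority. The name `byte_tokenize` is hypothetical.

```rust
use std::ffi::c_void;
use std::slice;

/// Toy tokenizer in which each input byte becomes one token whose ID is
/// the byte value. It is stateless, so it is trivially thread-safe, and
/// it accepts arbitrary bytes, including invalid UTF-8.
extern "C" fn byte_tokenize(
    _user_data: *const c_void, // unused by this toy tokenizer
    bytes: *const u8,
    bytes_len: usize,
    output_tokens: *mut u32,
    output_tokens_len: usize, // assumed: capacity of `output_tokens`
) -> usize {
    if bytes_len == 0 {
        return 0;
    }
    // Assumption: the caller passes a valid pointer when bytes_len > 0.
    let input = unsafe { slice::from_raw_parts(bytes, bytes_len) };
    // Never write past the output capacity.
    let n = bytes_len.min(output_tokens_len);
    if n > 0 {
        let out = unsafe { slice::from_raw_parts_mut(output_tokens, n) };
        for (dst, &b) in out.iter_mut().zip(input) {
            *dst = b as u32;
        }
    }
    // Assumed convention: report the total number of tokens.
    bytes_len
}
```

Such a function would be stored in the field as `Some(byte_tokenize)`. A real tokenizer would typically recover its state from tokenize_user_data instead of ignoring it.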

§use_approximate_greedy_tokenize_fn: bool

Set to true to not use tokenize_fn and instead tokenize greedily, which is often incorrect and may reduce accuracy.
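To illustrate why greedy tokenization can be inaccurate, here is a minimal sketch of one common greedy strategy, longest-prefix matching. Whether the library uses exactly this strategy is an assumption; `greedy_tokenize` is a hypothetical name used for illustration only.

```rust
/// Greedy longest-match tokenization: at each position, take the longest
/// vocabulary entry that is a prefix of the remaining input.
/// Returns None if some position cannot be matched by any entry.
/// (Hypothetical helper; the library's own strategy may differ.)
fn greedy_tokenize(vocab: &[&[u8]], mut input: &[u8]) -> Option<Vec<u32>> {
    let mut out = Vec::new();
    while !input.is_empty() {
        let (id, tok) = vocab
            .iter()
            .enumerate()
            // Skip empty entries so that every step consumes input.
            .filter(|(_, t)| !t.is_empty() && input.starts_with(t))
            .max_by_key(|(_, t)| t.len())?;
        out.push(id as u32);
        input = &input[tok.len()..];
    }
    Some(out)
}
```

With a vocabulary of "a", "ab", "b", "bc", "c", the input "abc" is split greedily as "ab" + "c", whereas a merge-based tokenizer could just as well produce "a" + "bc". A mismatch of this kind is how greedy tokenization can disagree with the model's actual tokenizer.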

§tokenize_user_data: *const c_void

User data to pass to the tokenize_fn

§slices: *const *const i8

Tokenizer partitions for the slicer optimization. This is an array of pointers to strings, terminated with NULL (argv style). Pass NULL to use the defaults. Pass an empty array to disable the optimization.
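An argv-style array can be built from Rust roughly as follows. This is a sketch under stated assumptions: `SliceArray` is a hypothetical wrapper, the strings passed to it are placeholders, and "empty array" is taken to mean an array whose first element is already the NULL terminator. The example uses `c_char` instead of `i8` because the C character type is not `i8` on every target.

```rust
use std::ffi::{c_char, CString};
use std::ptr;

/// Owns the C strings together with the NULL-terminated pointer array
/// that refers to them, so the pointers stay valid as long as this lives.
/// (Hypothetical helper, not part of the library.)
struct SliceArray {
    _owned: Vec<CString>,
    ptrs: Vec<*const c_char>,
}

impl SliceArray {
    fn new(items: &[&str]) -> Self {
        let owned: Vec<CString> = items
            .iter()
            .map(|s| CString::new(*s).expect("string contains an interior NUL"))
            .collect();
        let mut ptrs: Vec<*const c_char> = owned.iter().map(|s| s.as_ptr()).collect();
        // argv-style terminator
        ptrs.push(ptr::null());
        SliceArray { _owned: owned, ptrs }
    }

    /// Pointer suitable for a `*const *const c_char` field.
    fn as_ptr(&self) -> *const *const c_char {
        self.ptrs.as_ptr()
    }
}
```

Under these assumptions, `SliceArray::new(&[]).as_ptr()` yields an empty array (disabling the optimization), while `std::ptr::null()` in the field itself requests the defaults.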
