Clone the constraint
Clone a tokenizer.
This increments a reference count and does a small allocation.
Commit the token sampled with the mask returned from llg_compute_mask().
Can be run on the critical path of sampling (is fast).
Returns 0 on success and -1 on error (use llg_get_error() to get the exact error).
When 0 is returned, the result is written to *res_p.
Compute mask for the next token sampling
It typically takes up to a millisecond for a 100k tokenizer, so should be called in background.
Returns 0 on success and -1 on error (use llg_get_error() to get the exact error).
When 0 is returned, the result is written to *res_p.
Set the default values for the ConstraintInit
Disables ff_tokens and backtracking, enables warnings on stderr
and all logging to the buffer (get with llg_flush_logs()).
You need to set the tokenizer field manually.
Get the logs from the constraint, since last call to this function.
The logs are null-terminated.
The logs are kept in the constraint until the next call to this function
or until the constraint is freed.
Free the constraint
Free the tokenizer. Should NOT be called while there are still constraints using it.
Get the error message from the constraint or null if there is no error.
After it returns a non-null value, it will always return it until the constraint is freed
using llg_free_constraint() (at which point the pointer will be invalid).
Get the current temperature of the constraint.
It is updated by mask computation.
Check if constraint is stopped (cannot be extended further).
Create a new constraint from a grammar JSON string
Always returns a non-null value. Call llg_get_error() on the result to check for errors.
Create a new constraint with specified type
Type can be one of “regex”, “json_schema” (or “json”), “lark”, “llguidance” (or “guidance”)
Always returns a non-null value. Call llg_get_error() on the result to check for errors.
Create a new constraint from a given JSON schema
Always returns a non-null value. Call llg_get_error() on the result to check for errors.
Create a new constraint from a given lark grammar
Always returns a non-null value. Call llg_get_error() on the result to check for errors.
Create a new constraint from a given regular expression
Always returns a non-null value. Call llg_get_error() on the result to check for errors.
Construct a new tokenizer from the given TokenizerInit
Compute mask for several constraints in parallel.
Return a string representation of the tokens, useful for debugging.
The output is null-terminated.
Returns the number of bytes that would be written to output if output_len was large enough.
Tokenize the given bytes and return the tokens.
Always returns the number of tokens that would be written to output_tokens
if output_tokens_len was large enough.
Tokenize the given bytes and return the tokens.
Special tokens will be tokenized, if they follow 0xFF byte prefix.
Always returns the number of tokens that would be written to output_tokens
if output_tokens_len was large enough.