OpenAI-compatible completions endpoint handler.

POST /v1/completions
curl --request POST \
--url https://example.com/v1/completions \
--header 'Content-Type: application/json' \
--data '{ "best_of": 1, "dry_allowed_length": null, "dry_base": null, "dry_multiplier": null, "dry_sequence_breakers": null, "echo": false, "frequency_penalty": null, "grammar": { "type": "regex", "value": "example" }, "logit_bias": null, "logprobs": null, "max_tokens": 16, "min_p": null, "model": "mistral", "n": 1, "presence_penalty": null, "prompt": "Say this is a test.", "repetition_penalty": null, "stop": "example", "stream": true, "suffix": null, "temperature": 0.7, "tool_choice": "none", "tools": null, "top_k": null, "top_p": null, "truncate_sequence": null, "user": "example" }'
Media type application/json

Legacy OpenAI-compatible text completion request.
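
As a rough illustration of the request shape, the sketch below assembles the same POST in Python using only the standard library. The helper name `build_completion_request` is hypothetical, the URL is the placeholder from the curl sample, and dropping unset options assumes that an omitted field behaves like an explicit null. The request is built but not sent.

```python
import json
import urllib.request


def build_completion_request(prompt, model="default", **options):
    """Build (but do not send) a POST request for /v1/completions."""
    # `prompt` is the only field this reference marks as required.
    body = {"model": model, "prompt": prompt}
    # Assumption: leaving a nullable field out is equivalent to sending null.
    body.update({k: v for k, v in options.items() if v is not None})
    return urllib.request.Request(
        "https://example.com/v1/completions",  # placeholder host from the curl sample
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_completion_request(
    "Say this is a test.", model="mistral", max_tokens=16, temperature=0.7, top_k=None
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a server.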

object
best_of
integer | null
Example
1
dry_allowed_length

Longest repeated sequence DRY leaves unpenalized.

integer | null
Example
null
dry_base

Base for DRY’s exponential penalty growth.

number | null format: float
Example
null
dry_multiplier

DRY repetition penalty multiplier; 0 disables DRY.

number | null format: float
Example
null
dry_sequence_breakers

Sequences that reset DRY repetition matching.

Array<string> | null
Example
null
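
How the three numeric DRY settings interact can be sketched as follows. This assumes the server follows the commonly described DRY formulation, in which a token that would extend a repeated sequence of length `match_length` is penalized by `multiplier * base ** (match_length - allowed_length)`; this reference does not state the formula. The function name and the example values are illustrative only.

```python
def dry_penalty(match_length, multiplier, base, allowed_length):
    """Penalty subtracted from the logit of a token that would extend a repeat.

    match_length is the number of preceding tokens that already repeat
    an earlier part of the context.
    """
    # A multiplier of 0 disables DRY; short repeats are left unpenalized.
    if multiplier == 0 or match_length < allowed_length:
        return 0.0
    # The penalty grows exponentially with the length of the repeat.
    return multiplier * base ** (match_length - allowed_length)


for n in range(1, 6):
    print(n, dry_penalty(n, multiplier=0.8, base=1.75, allowed_length=2))
```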
echo

Echo the prompt back alongside the completion.

boolean
Example
false
frequency_penalty

Penalize tokens by how often they have appeared so far; positive values reduce repetition.

number | null format: float
Example
null
grammar
One of:
null
logit_bias

Bias added to the logits of these token IDs before sampling.

object | null
Example
null
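
The effect of `logit_bias` can be pictured with a small sketch. It assumes the OpenAI convention of a JSON object that maps token IDs (as string keys) to bias values; the function name and the three-token vocabulary are made up for illustration.

```python
def apply_logit_bias(logits, logit_bias):
    """Return a copy of `logits` with each listed token's bias added."""
    biased = list(logits)
    for token_id, bias in logit_bias.items():
        # JSON object keys are strings, so convert them to integer token IDs.
        biased[int(token_id)] += bias
    return biased


# A large negative bias effectively bans token 1; a small positive one favors token 2.
print(apply_logit_bias([1.0, 2.0, 3.0], {"1": -100, "2": 0.5}))
```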
logprobs

Include log probabilities of this many most likely tokens.

integer | null
Example
null
max_tokens

Maximum number of tokens to generate.

integer | null
Example
16
min_p

Drop tokens below this fraction of the top token’s probability.

number | null format: double
Example
null
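
A minimal sketch of the `min_p` rule described above, operating on an already normalized probability list. The function name and the sample distribution are illustrative, and renormalizing the surviving tokens is an assumption about how sampling continues afterwards.

```python
def min_p_filter(probs, min_p):
    """Zero out tokens whose probability is below min_p * max(probs)."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    # Renormalize so the surviving tokens form a distribution again.
    return [p / total for p in kept]


# With min_p = 0.2 the cutoff is 0.2 * 0.5 = 0.1, so only the last token is dropped.
print(min_p_filter([0.5, 0.3, 0.15, 0.05], min_p=0.2))
```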
model

Model ID; “default” targets the only loaded model.

string
Example
mistral
n

How many choices to generate.

integer
Example
1
presence_penalty

Penalize tokens that have already appeared; positive values push toward new topics.

number | null format: float
Example
null
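
The difference between `presence_penalty` and `frequency_penalty` is easiest to see side by side. The sketch below assumes the formula documented for the OpenAI API, where the frequency penalty scales with how many times a token has appeared and the presence penalty is applied once to any token that has appeared at all. Whether this server uses exactly that formula is not stated here, and the function name is hypothetical.

```python
from collections import Counter


def apply_penalties(logits, generated_ids, frequency_penalty=0.0, presence_penalty=0.0):
    """Subtract count-based and presence-based penalties from the logits."""
    counts = Counter(generated_ids)
    out = list(logits)
    for token_id, count in counts.items():
        # Frequency term grows with each repeat; presence term is a flat one-off.
        out[token_id] -= count * frequency_penalty + presence_penalty
    return out


# Token 0 appeared twice, token 1 once, token 2 never.
print(apply_penalties([2.0, 2.0, 2.0], [0, 0, 1], frequency_penalty=0.5, presence_penalty=0.25))
```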
prompt
required
string
Example
Say this is a test.
repetition_penalty

Multiplicative repetition penalty; 1.0 disables it.

number | null format: float
Example
null
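
One common way to implement a multiplicative repetition penalty divides positive logits of already seen tokens by the penalty and multiplies negative ones by it, so a value of 1.0 changes nothing. The sketch below assumes that variant; this reference does not say which one the server uses, and the function name is illustrative.

```python
def apply_repetition_penalty(logits, seen_ids, penalty=1.0):
    """Make previously seen tokens less likely by scaling their logits."""
    out = list(logits)
    for token_id in set(seen_ids):
        if out[token_id] > 0:
            out[token_id] /= penalty  # shrink positive logits
        else:
            out[token_id] *= penalty  # push negative logits further down
    return out


print(apply_repetition_penalty([2.0, -1.0, 3.0], seen_ids=[0, 1], penalty=2.0))
```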
stop
One of:
null
stream

Stream the response as server-sent events.

boolean | null
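
When `stream` is true, a client reads the response line by line. The sketch below parses a canned list of lines rather than a live connection. It assumes the OpenAI streaming conventions of `data: <json>` events, a final `data: [DONE]` sentinel, and completion text under `choices[0].text`; none of these are confirmed by this page, and the sample events are invented.

```python
import json


def iter_sse_chunks(lines):
    """Yield the decoded JSON payload of each `data:` event until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        yield json.loads(payload)


sample = [
    'data: {"choices": [{"text": "This"}]}',
    "",
    'data: {"choices": [{"text": " is a test."}]}',
    "",
    "data: [DONE]",
]
text = "".join(chunk["choices"][0]["text"] for chunk in iter_sse_chunks(sample))
print(text)
```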
suffix

Text appended after the completion.

string | null
Example
null
temperature

Sampling temperature; higher values increase randomness.

number | null format: double
Example
0.7
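
To see why higher temperatures increase randomness, the following sketch applies the standard temperature-scaled softmax to a toy logit vector. The function name and the logits are illustrative, and the temperature must be greater than zero for this formula to apply.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


for t in (0.5, 1.0, 2.0):
    # Lower temperatures concentrate probability on the top token.
    print(t, [round(p, 3) for p in softmax_with_temperature([2.0, 1.0, 0.0], t)])
```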
tool_choice
One of:
null
tools

Tools the model may call.

Array<object> | null

Tool definition

object
function
required

Function definition for a tool

object
description
string | null
name
required
string
parameters
object | null
strict

When true, the tool’s parameters JSON schema is enforced on the generated arguments via constrained decoding (llguidance).

boolean | null
type
required

Type of tool

string
Allowed values: function
Example
null
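
For illustration, a `tools` array matching the schema above might be assembled like this. The `get_weather` function, its description, and its parameter schema are hypothetical; only the field names (`type`, `function`, `name`, `description`, `parameters`, `strict`) come from this reference.

```python
import json

tools = [
    {
        "type": "function",  # the only allowed value
        "function": {
            "name": "get_weather",  # hypothetical tool name
            "description": "Look up the current weather for a city.",
            "parameters": {  # JSON schema for the arguments
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
            "strict": True,  # ask for schema-constrained arguments
        },
    }
]

print(json.dumps({"prompt": "What is the weather in Paris?", "tools": tools}, indent=2))
```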
top_k

Sample only from the k most likely tokens.

integer | null
Example
null
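
A minimal sketch of top-k filtering on a normalized probability list. The function name and sample values are illustrative, and renormalizing the kept tokens is an assumption about what happens before sampling.

```python
def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize; zero out the rest."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]


# Only the two most likely tokens (indices 1 and 3) survive.
print(top_k_filter([0.1, 0.4, 0.2, 0.3], k=2))
```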
top_p

Nucleus sampling: only tokens within the top cumulative probability mass are considered.

number | null format: double
Example
null
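
Nucleus sampling can be sketched as follows: tokens are taken in order of decreasing probability until their cumulative mass reaches `top_p`, and everything else is discarded. The function name and sample distribution are illustrative, and including the token that crosses the threshold is the usual convention rather than something this page specifies.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of top tokens whose cumulative mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:  # the token that crosses the threshold is kept
            break
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]


# The first two tokens already cover more than 0.7 of the mass.
print(top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.7))
```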
truncate_sequence

Truncate inputs that exceed the model’s context length instead of erroring.

boolean | null
Example
null
user
string | null