Serve models

Tutorial 2 covers basic single-model serving. These guides cover configuration beyond a single local server, including OpenAI-compatible /v1 clients and Anthropic-compatible /v1/messages clients.

Choose by task

If you need to…	Start here
Change host, port, CORS, request limits, or authentication	HTTP server configuration
Serve more than one model from one process	Running multiple models
Use the browser chat interface	Using the web UI
Use OpenAI-compatible clients	OpenAI-compatible APIs
Use the newer OpenAI Responses endpoint	OpenAI Responses API
Use Anthropic-compatible clients	Anthropic Messages API
Use Codex or Claude Code with a local server	Use Codex and Claude Code

For operational concerns (reverse proxy, Docker, health checks, TLS), see the deployment guides.