Web search

--enable-search exposes a web_search tool to the model.

The built-in search and extraction tools use strict tool calling by default, so generated queries and URLs are constrained to the declared JSON Schema.

mistralrs serve --enable-search -m <model>

The built-in backend uses DuckDuckGo (https://html.duckduckgo.com/html/?q=...). Up to 10 results are returned per query. Results pass through a readability-style extractor.

When a reranker is enabled, retrieved results pass through an embedding-based reranker before reaching the model. To enable it:

mistralrs serve --enable-search \
    --search-embedding-model embedding-gemma \
    -m <model>

--search-embedding-model accepts embedding-gemma. It requires --enable-search (or --agent/--agentic, which turns search on as part of the one-flag agent preset).
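
To make the reranking step concrete, here is a minimal conceptual sketch of embedding-based reranking. It is not the mistral.rs implementation: the `embed` function is a toy bag-of-words stand-in for a real embedding model such as embedding-gemma, and the names `embed`, `cosine`, and `rerank` are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model (assumption): bag-of-words counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query: str, results: list[dict], top_k: int = 3) -> list[dict]:
    # Score each result's content against the query and keep the best top_k.
    q = embed(query)
    ranked = sorted(results, key=lambda r: cosine(q, embed(r["content"])), reverse=True)
    return ranked[:top_k]
```

A real reranker would replace `embed` with dense vectors from the embedding model; the ranking step itself (score each result against the query, sort, truncate) stays the same shape.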

The OpenAI web_search_options field controls per-request behavior:

{
  "model": "default",
  "messages": [{"role": "user", "content": "What happened at CES this year?"}],
  "web_search_options": {
    "search_context_size": "medium"
  }
}

Fields on WebSearchOptions:

  • search_context_size: low, medium (default), high.
  • user_location: optional location hint.
  • search_description: optional description shown to the model.
  • extract_description: optional description for content extraction.
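
As a rough sketch of how a client might assemble such a request, the snippet below builds the payload and posts it with the Python standard library. The helper names `build_search_request` and `send`, and the server URL and port, are assumptions for illustration; only the field names come from the list above.

```python
import json
import urllib.request

VALID_CONTEXT_SIZES = {"low", "medium", "high"}


def build_search_request(prompt, search_context_size="medium", search_description=None):
    # Reject values outside the documented set before sending anything.
    if search_context_size not in VALID_CONTEXT_SIZES:
        raise ValueError(f"search_context_size must be one of {sorted(VALID_CONTEXT_SIZES)}")
    options = {"search_context_size": search_context_size}
    if search_description is not None:
        options["search_description"] = search_description
    return {
        "model": "default",
        "messages": [{"role": "user", "content": prompt}],
        "web_search_options": options,
    }


def send(payload, url="http://localhost:1234/v1/chat/completions"):
    # The URL is an assumption; point it at wherever your server is listening.
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `send(build_search_request("What happened at CES this year?", "high"))` would issue the request against a running server started with --enable-search.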

The Python and Rust SDKs accept a search_callback. The callback receives a query string and returns a list of result dicts. This is useful, for example, for searching internal corpora.

Python:

from mistralrs import Runner, Which


def my_search(query: str) -> list[dict]:
    # Return one dict per result, using the keys shown here.
    return [
        {"title": "...", "description": "...", "url": "internal://...", "content": "..."},
        # ...
    ]


runner = Runner(
    which=Which.Plain(model_id="Qwen/Qwen3-4B"),
    enable_search=True,
    search_callback=my_search,
)
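
For illustration, a callback over a small in-memory corpus might look like the sketch below. The `CORPUS` contents, the `corpus_search` name, and the keyword-overlap scoring are all hypothetical; a real deployment would more likely query a search index or database. The returned dicts use the same keys as the example above.

```python
# Hypothetical internal documents (assumption, for illustration only).
CORPUS = [
    {
        "title": "Vacation policy",
        "description": "How to request time off",
        "url": "internal://hr/vacation",
        "content": "Employees request vacation through the HR portal.",
    },
    {
        "title": "Deploy guide",
        "description": "Release steps",
        "url": "internal://eng/deploy",
        "content": "Run the deploy pipeline after review.",
    },
]


def corpus_search(query: str) -> list[dict]:
    # Rank documents by how many query terms appear in the title or content.
    terms = set(query.lower().split())
    scored = []
    for doc in CORPUS:
        words = set((doc["title"] + " " + doc["content"]).lower().split())
        overlap = len(terms & words)
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]
```

Passing `search_callback=corpus_search` in place of `my_search` would route the model's search queries to this function.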