What this does. The model decides which of your registered tools to invoke and generates typed arguments. You execute the tool and return the result, then the model continues.

When to use it. Anything where the model needs to act on the world: call an API, query a database, run a calculation, fetch a document.
Minimal code
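
A minimal sketch of the loop, assuming an OpenAI-compatible message shape (an assistant message carrying `tool_calls`, answered by `tool` role messages keyed by `tool_call_id`). The names `get_weather`, `REGISTRY`, `TOOLS`, `run_tool_loop`, and `call_model` are hypothetical and only for illustration; `call_model(messages, tools)` stands in for whatever client call sends the request and returns the assistant message as a dict.

```python
import json

# Hypothetical tool for illustration; replace with your own function.
def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 21}

# Maps tool names (as the model sees them) to the Python callables.
REGISTRY = {"get_weather": get_weather}

# Tool definitions sent with every request (assumed OpenAI-compatible schema).
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current temperature in Celsius for a city, e.g. city='Paris'.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}]

def run_tool_loop(call_model, messages, max_turns=10):
    """Run until the model answers without tool calls, or max_turns is hit.

    call_model(messages, tools) is an assumed callable that returns the
    assistant message as a dict.
    """
    for _ in range(max_turns):
        assistant = call_model(messages, TOOLS)
        # Keep the whole assistant turn, tool_calls included.
        messages.append(assistant)
        tool_calls = assistant.get("tool_calls") or []
        if not tool_calls:
            return assistant["content"]
        for call in tool_calls:
            fn = REGISTRY[call["function"]["name"]]
            args = json.loads(call["function"]["arguments"])
            result = fn(**args)
            # One tool message per call; content must be a string.
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    raise RuntimeError("tool loop exceeded max_turns")
```

With an OpenAI-compatible SDK, `call_model` would typically wrap the chat completions call and convert the returned message to a dict. Passing it in as a callable keeps the loop itself independent of any particular client.
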
What to tune
| Parameter | Effect |
|---|---|
| `tool_choice: "auto"` | Model chooses when to call a tool (default) |
| `tool_choice: "required"` | Force a tool call every turn, no free-form response |
| `tool_choice: {type:"function", function:{name:"..."}}` | Force a specific tool |
| `parallel_tool_calls: false` | Disable parallel calls (default is true on capable models) |
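
As an illustration of where these parameters sit, here is a sketch that assembles request keyword arguments for an OpenAI-compatible chat completions call. `build_request` is a hypothetical helper and the model name is a placeholder, not a real identifier.

```python
def build_request(messages, tools, force_tool=None, require_tool=False, parallel=True):
    """Assemble request kwargs (hypothetical helper, not part of any SDK)."""
    if force_tool is not None:
        # Force one specific tool by name.
        tool_choice = {"type": "function", "function": {"name": force_tool}}
    elif require_tool:
        # Force some tool call every turn.
        tool_choice = "required"
    else:
        # Let the model decide.
        tool_choice = "auto"
    return {
        "model": "your-model-name",  # placeholder
        "messages": messages,
        "tools": tools,
        "tool_choice": tool_choice,
        "parallel_tool_calls": parallel,
    }
```

For example, `build_request(messages, tools, force_tool="get_weather", parallel=False)` would force a single call to a tool named `get_weather`.
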
Common mistakes
- Appending the assistant message incorrectly. When the model returns tool calls, push the entire assistant message (with `tool_calls`) to history, then push one `tool` role message per call. Dropping the assistant turn breaks the state.
- Non-JSON-serializable tool results. The `content` on a tool-role message must be a string. Always `json.dumps(...)` your return value.
- Model not calling tools. The `description` on the function matters. Write it as if the model has never seen the API before: what the tool does, what each arg means, example inputs.
- Infinite loops. Add a max turn count around the `while True` loop. 10 turns is plenty for most patterns.
- Argument validation. The model can hallucinate required fields. Validate with Pydantic or Zod before executing.
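
The validation and serialization points can be sketched with the standard library alone. This example swaps in a hand-rolled check in place of Pydantic to stay dependency-free, and it only covers required keys, unexpected keys, and basic types from a JSON-Schema-style `parameters` object. `validate_args` and `execute_tool_call` are hypothetical helpers.

```python
import json

# Maps JSON Schema type names to Python types (basic types only).
_JSON_TYPES = {
    "string": str, "integer": int, "number": (int, float),
    "boolean": bool, "object": dict, "array": list,
}

def validate_args(args, schema):
    """Return a list of problems; an empty list means the args passed."""
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]
    errors = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required field: {name}")
    for name, value in args.items():
        if name not in props:
            errors.append(f"unexpected field: {name}")
            continue
        expected = _JSON_TYPES.get(props[name].get("type"))
        if expected is None:
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        bool_misuse = isinstance(value, bool) and expected is not bool
        if bool_misuse or not isinstance(value, expected):
            errors.append(f"wrong type for {name}")
    return errors

def execute_tool_call(call, fn, schema):
    """Validate, run, and always return a string for the tool message content."""
    try:
        args = json.loads(call["function"]["arguments"])
    except json.JSONDecodeError as exc:
        return json.dumps({"error": f"invalid JSON arguments: {exc.msg}"})
    problems = validate_args(args, schema)
    if problems:
        return json.dumps({"error": problems})
    # default=str keeps values such as datetimes from raising in json.dumps.
    return json.dumps(fn(**args), default=str)
```

Returning validation errors as the tool result, rather than raising, is one possible design: it lets the model see what went wrong and retry with corrected arguments.
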
Next steps
- Structured output: when you want JSON back but don’t need a tool loop.
- RAG: retrieval with tools as the search interface.
- Streaming: stream assistant tokens and tool calls.
- OpenAI compatibility: the tool-calling contract in detail.