Tool calling - RunInfra

What this does. The model decides which of your registered tools to invoke and generates JSON arguments that follow each tool's parameter schema. You execute the tool and return the result, and the model continues from there.

When to use it. Anything where the model needs to act on the world: call an API, query a database, run a calculation, fetch a document.

Minimal code

import json
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runinfra.ai/v1",
    api_key="YOUR_RUNINFRA_API_KEY",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 21, "conditions": "partly cloudy"}

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

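while True:
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=messages,
        tools=tools,
    )
    msg = response.choices[0].message
    # Keep the full assistant message (including tool_calls) in history.
    messages.append(msg)

    # No tool calls means the model has produced its final answer.
    if not msg.tool_calls:
        print(msg.content)
        break

    # Run each requested call and return one tool message per call.
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
        result = get_weather(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result),  # content must be a string
        })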

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.runinfra.ai/v1",
  apiKey: "YOUR_RUNINFRA_API_KEY",
});

const tools = [{
  type: "function" as const,
  function: {
    name: "get_weather",
    description: "Get the current weather for a city",
    parameters: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
}];

function getWeather(city: string) {
  return { city, temp_c: 21, conditions: "partly cloudy" };
}

const messages: any[] = [
  { role: "user", content: "What's the weather in Paris?" },
];

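while (true) {
  const response = await client.chat.completions.create({
    model: "llama-3.3-70b",
    messages,
    tools,
  });
  const msg = response.choices[0].message;
  // Keep the full assistant message (including tool_calls) in history.
  messages.push(msg);

  // No tool calls means the model has produced its final answer.
  if (!msg.tool_calls?.length) { console.log(msg.content); break; }

  // Run each requested call and return one tool message per call.
  for (const call of msg.tool_calls) {
    const args = JSON.parse(call.function.arguments); // arguments arrive as a JSON string
    const result = getWeather(args.city);
    messages.push({
      role: "tool",
      tool_call_id: call.id,
      content: JSON.stringify(result), // content must be a string
    });
  }
}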
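
Both loops above route every call to get_weather, which is fine while only one tool is registered. With more than one tool, each call has to be dispatched on call.function.name. The following is a minimal Python sketch of that dispatch step, assuming a hypothetical TOOL_REGISTRY dict, a hypothetical execute_tool_call helper, and a made-up second tool, get_time; none of these are part of the RunInfra or OpenAI API. Unknown tool names and bad arguments are reported back as the tool result so the model can react to them.

```python
import json


def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 21, "conditions": "partly cloudy"}


def get_time(timezone: str) -> dict:
    # Hypothetical second tool, stubbed with a fixed value for illustration.
    return {"timezone": timezone, "time": "12:00"}


# Hypothetical registry: tool name (as declared in `tools`) -> Python callable.
TOOL_REGISTRY = {"get_weather": get_weather, "get_time": get_time}


def execute_tool_call(name: str, arguments: str) -> str:
    """Run one tool call and return the string used as the tool message content."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        args = json.loads(arguments)
        return json.dumps(handler(**args))
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON, or arguments that do not match the function signature.
        return json.dumps({"error": str(exc)})
```

Inside the loop, this would replace the direct get_weather call: the tool message content becomes execute_tool_call(call.function.name, call.function.arguments).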

What to tune

| Parameter | Effect |
| --- | --- |
| `tool_choice: "auto"` | Model chooses when to call a tool (default) |
| `tool_choice: "required"` | Force a tool call every turn, no free-form response |
| `tool_choice: {type:"function", function:{name:"..."}}` | Force a specific tool |
| `parallel_tool_calls: false` | Disable parallel calls (default is true on capable models) |
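
As a sketch of how the parameters in the table above combine, the hypothetical helper below (build_tool_options is an illustrative name, not part of any SDK) assembles them into keyword arguments. It covers only the values listed in the table. The resulting dict would be unpacked into the client.chat.completions.create call alongside model, messages, and tools.

```python
from typing import Any, Optional


def build_tool_options(
    mode: str = "auto",
    force_tool: Optional[str] = None,
    parallel: bool = True,
) -> dict[str, Any]:
    """Return tool_choice / parallel_tool_calls keyword arguments for a request."""
    if force_tool is not None:
        # Force one specific tool by name.
        tool_choice: Any = {"type": "function", "function": {"name": force_tool}}
    elif mode in ("auto", "required"):
        tool_choice = mode
    else:
        raise ValueError(f"unsupported tool_choice mode: {mode!r}")

    options: dict[str, Any] = {"tool_choice": tool_choice}
    if not parallel:
        # Only send the flag when turning parallel calls off.
        options["parallel_tool_calls"] = False
    return options
```

For example, build_tool_options(force_tool="get_weather", parallel=False) yields options that force a single get_weather call per turn.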

Common mistakes

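Appending the assistant message incorrectly. When the model returns tool calls, push the entire assistant message (including its tool_calls) to the history, then push one tool-role message per call. If you drop the assistant turn, each tool message carries a tool_call_id that no earlier message declares, and the next request no longer describes a valid conversation.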
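Non-JSON-serializable tool results. The content of a tool-role message must be a string. Serialize the return value with json.dumps(...) in Python or JSON.stringify(...) in TypeScript, and make sure it contains only JSON-serializable values.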
Model not calling tools. The description on the function matters. Write it as if the model has never seen the API before: what the tool does, what each arg means, example inputs.
Infinite loops. Cap the number of turns instead of looping with a bare while True (or while (true)). Ten turns is plenty for most patterns.
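Argument validation. The model can omit required fields, invent extra ones, or return values of the wrong type. Validate the parsed arguments with Pydantic (Python) or Zod (TypeScript) before executing the tool.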
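
The last two points can be combined in one loop. The sketch below is a dependency-free Python illustration, not a RunInfra reference implementation: run_tool_loop, validate_args, and the create and handlers parameters are hypothetical names, and the hand-written check is a small stand-in for a Pydantic model. create is assumed to wrap client.chat.completions.create with the model already bound, for example functools.partial(client.chat.completions.create, model="llama-3.3-70b").

```python
import json
from typing import Any, Callable


def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 21, "conditions": "partly cloudy"}


def validate_args(schema: dict, args: Any) -> list[str]:
    """Stand-in for Pydantic: check required keys, unknown keys, and string types."""
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]
    props = schema.get("properties", {})
    errors = [f"missing required field: {k}" for k in schema.get("required", []) if k not in args]
    errors += [f"unexpected field: {k}" for k in args if k not in props]
    errors += [
        f"{k} must be a string"
        for k, v in args.items()
        if props.get(k, {}).get("type") == "string" and not isinstance(v, str)
    ]
    return errors


def run_tool_loop(
    create: Callable[..., Any],
    messages: list,
    tools: list,
    handlers: dict[str, Callable[..., Any]],
    max_turns: int = 10,
) -> Any:
    """Run the tool loop for at most max_turns model calls; return the final text."""
    schemas = {t["function"]["name"]: t["function"]["parameters"] for t in tools}
    for _ in range(max_turns):
        msg = create(messages=messages, tools=tools).choices[0].message
        messages.append(msg)  # keep the whole assistant turn, tool_calls included
        if not msg.tool_calls:
            return msg.content
        for call in msg.tool_calls:
            name = call.function.name
            try:
                args = json.loads(call.function.arguments)
            except json.JSONDecodeError:
                args = None  # rejected by validate_args below
            if name not in handlers:
                result = {"errors": [f"unknown tool: {name}"]}
            else:
                errors = validate_args(schemas.get(name, {}), args)
                # Execute only when the arguments passed validation.
                result = {"errors": errors} if errors else handlers[name](**args)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": json.dumps(result),
            })
    raise RuntimeError(f"no final answer after {max_turns} turns")
```

Called as run_tool_loop(create, messages, tools, {"get_weather": get_weather}), the loop sends validation errors back as the tool result so the model can correct its arguments on the next turn, and raises RuntimeError once the turn cap is reached.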

Next steps

Structured output. When you want JSON back but don’t need a tool loop.
RAG. Retrieval with tools as the search interface.
Streaming. Stream assistant tokens and tool calls.
OpenAI compatibility. The tool-calling contract in detail.
