Tool-calling schema compliance check (repro steps + observed output)

#9
by lesj0610 - opened

Hello, I'm trying to verify whether this model supports schema-based tool calling (a strict tool name plus arguments matching a predefined schema), rather than generating API-usage explanations.

Repro steps (minimal):

  1. Provide the model a single tool schema (OpenAI-style function calling):
{
  "name": "get_current_weather",
  "description": "Get the current weather for a given city",
  "parameters": {
    "type": "object",
    "properties": {
      "location": { "type": "string", "description": "City name with country code (e.g., Seoul,KR)" },
      "unit": { "type": "string", "enum": ["metric", "imperial"] }
    },
    "required": ["location"]
  }
}
  2. Ask the model in natural Korean (no mention of tools/APIs/functions):

지금 서울의 현재 기온이 몇 도인가요? (English: "What is the current temperature in Seoul right now?")

Expected output (example of a correct schema-compliant tool call):

{
  "name": "get_current_weather",
  "arguments": { "location": "Seoul,KR", "unit": "metric" }
}

Observed output (translated from Korean; JSON reproduced as returned):

Assuming weather information is accessible, the JSON-format tool-call template for looking up the current temperature in Seoul is as follows. It is based on a typical weather API request format (e.g., the OpenWeatherMap API).

{
  "method": "GET",
  "endpoint": "https://api.openweathermap.org/data/2.5/weather",
  "query_params": {
    "q": "Seoul,KR",
    "appid": "YOUR_API_KEY",
    "units": "metric",
    "lang": "kr"
  }
}

Explanation:
method: HTTP request method (GET)
endpoint: API endpoint that provides the weather data
query_params:
q: city and country code (Seoul,KR)
appid: your issued OpenWeatherMap API key (replace with your own key for actual use)
units: metric, specified to get temperatures in Celsius
lang: lang=kr to set the response language to Korean (may be omitted depending on whether Korean is supported)

⚠️ Note: A valid appid is required for actual execution, and the JSON above is a tool-call template that describes the request structure. I cannot call the API directly, so it must be executed by an external system or backend based on this format.

Example response (partial, on success):
{
  "main": {
    "temp": 18.5,
    "feels_like": 17.8,
    "humidity": 65
  },
  "weather": [
    {
      "description": "흩어진 구름",
      "icon": "03d"
    }
  ],
  "name": "Seoul"
}

Based on this structure, the current temperature in Seoul (e.g., 18.5°C) can be retrieved from an actual weather API.

If needed, I can also provide a Python code example.

From this result, the model appears to generate an API request example rather than emitting a schema-compliant tool call using the provided tool name and argument schema.
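
To make "schema-compliant" concrete, here is a minimal sketch of the check being applied. It uses only the standard library and covers just the features the schema above uses (required keys, string types, enum values). The function name `check_tool_call` is made up for illustration, and rejecting unknown arguments is a deliberately strict choice that goes beyond what the schema itself states.

```python
import json

# The tool schema from the repro steps (descriptions omitted for brevity).
TOOL_SCHEMA = {
    "name": "get_current_weather",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["metric", "imperial"]},
        },
        "required": ["location"],
    },
}


def check_tool_call(raw: str, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the call matches."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(call, dict):
        return ["top-level value is not an object"]

    problems = []
    if call.get("name") != schema["name"]:
        problems.append(f"unexpected tool name: {call.get('name')!r}")

    args = call.get("arguments")
    if not isinstance(args, dict):
        return problems + ["'arguments' is missing or not an object"]

    params = schema["parameters"]
    for key in params.get("required", []):
        if key not in args:
            problems.append(f"missing required argument: {key}")
    for key, value in args.items():
        spec = params["properties"].get(key)
        if spec is None:
            # Strict choice (assumption): arguments not in the schema are rejected.
            problems.append(f"unknown argument: {key}")
            continue
        if spec.get("type") == "string" and not isinstance(value, str):
            problems.append(f"argument {key} is not a string")
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"argument {key} not in {spec['enum']}")
    return problems


expected = '{"name": "get_current_weather", "arguments": {"location": "Seoul,KR", "unit": "metric"}}'
observed = '{"method": "GET", "endpoint": "https://api.openweathermap.org/data/2.5/weather", "query_params": {"q": "Seoul,KR"}}'

print(check_tool_call(expected, TOOL_SCHEMA))  # empty list: compliant
print(check_tool_call(observed, TOOL_SCHEMA))  # wrong name, no arguments object
```

The expected output passes, while the JSON in the observed output fails on both the tool name and the missing `arguments` object.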

Could you clarify:

  1. Was schema-level tool calling included in training for this model?
  2. If yes, is there a required serving configuration or prompt format to enable strict tool-call emission?

Thank you.

LG AI Research org

Hello, @lesj0610. Thank you for your interest.

We checked your case but could not reproduce the same failure.

Here is the snippet we tested:

tools = [{
    "name": "get_current_weather",
    "description": "Get the current weather for a given city",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name with country code (e.g., Seoul,KR)"},
            "unit": {"type": "string", "enum": ["metric", "imperial"]}
        },
        "required": ["location"]
    }
}]

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    # User prompt: "What is the current temperature in Seoul right now?"
    {"role": "user", "content": "지금 서울의 현재 기온이 몇 도인가요?"},
]

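inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    return_dict=True,  # return a mapping so that **inputs below unpacks input_ids and attention_mask
    add_generation_prompt=True,
    tools=tools,  # the tool schema is rendered into the prompt by the chat template
    chat_template_kwargs={"enable_thinking": False},
)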

output = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.0,
    top_p=0.95,
    max_new_tokens=256,
)

Please note that when you use a tool-augmented chat template, the tools schema must be passed to the HF tokenizer (via apply_chat_template), not directly to model.generate().
(For an inference server, please check its documentation on OpenAI-compatible tool calling.)

With this setup, we get the following tool call:

<tool_call>{"name": "get_current_weather", "arguments": {"location": "Seoul,KR", "unit": "metric"}}</tool_call>
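
For anyone consuming this output programmatically, here is a minimal sketch of pulling the call out of the decoded text. It assumes only the `<tool_call>...</tool_call>` wrapper shown above; the helper name `extract_tool_calls` is illustrative and not part of any library.

```python
import json
import re

# Matches the payload between <tool_call> and </tool_call>, across newlines.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return every well-formed JSON payload wrapped in <tool_call> tags."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            calls.append(json.loads(match.group(1)))
        except json.JSONDecodeError:
            # Skip payloads that are not valid JSON instead of failing outright.
            continue
    return calls


decoded = '<tool_call>{"name": "get_current_weather", "arguments": {"location": "Seoul,KR", "unit": "metric"}}</tool_call>'
calls = extract_tool_calls(decoded)
print(calls[0]["name"], calls[0]["arguments"])
```

If the model answers in plain prose, as in the original report, the function returns an empty list, which is a quick signal that the schema never reached the prompt.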

It seems the tool schema may not have been passed to the model properly in your setup.
If you can share the generation part of your code, we can take a closer look.

Thanks for the clarification and the working snippet.

One key detail: my testing was done only via the FriendliAI web UI (I cannot run the 236B model locally). In that UI, I do not have any way to verify whether the tools schema is actually injected into the model inputs (i.e., whether the server uses a tool-augmented chat template like tokenizer.apply_chat_template(..., tools=tools, ...)).

So the failure I observed (API-style JSON explanation instead of a tool call) may be due to the FriendliAI web UI pipeline not passing tools/schema, rather than the base model lacking tool-call training.

Could you confirm whether the FriendliAI web UI currently supports tool calling with schema injection?
If yes, could you share the minimal steps/settings in the UI to enable it, or provide an OpenAI-compatible endpoint example request that triggers tool calls?

If the web UI does not inject tools/schema, then my observation is expected from the deployment, and I will update my report accordingly.

Thank you.

Thanks for the clarification. I was testing via the FriendliAI web UI, where I couldn't verify whether the tools schema was actually injected. After switching to the FriendliAI OpenAI-compatible API and explicitly passing tools (and tool_choice), the model correctly returned finish_reason: "tool_calls" with a valid get_current_weather tool call. I'll update my report accordingly.
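
For reference, a rough sketch of the request shape that worked, following the standard OpenAI-compatible chat completions format. The model id is a placeholder, no endpoint URL is included, and the response at the bottom is a hand-written illustration of the format, not actual server output. The helper names are made up for this example.

```python
import json


def build_request(model: str, user_text: str, tool_schema: dict) -> dict:
    """Build an OpenAI-compatible chat completions payload with one tool."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        # OpenAI-style wrapping: each tool is {"type": "function", "function": {...}}.
        "tools": [{"type": "function", "function": tool_schema}],
        "tool_choice": "auto",
    }


def read_tool_calls(response: dict) -> list[tuple[str, dict]]:
    """Return (name, arguments) pairs if the model stopped to call tools."""
    choice = response["choices"][0]
    if choice.get("finish_reason") != "tool_calls":
        return []
    return [
        # "arguments" arrives as a JSON-encoded string in this format.
        (tc["function"]["name"], json.loads(tc["function"]["arguments"]))
        for tc in choice["message"].get("tool_calls", [])
    ]


weather_tool = {
    "name": "get_current_weather",
    "description": "Get the current weather for a given city",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name with country code (e.g., Seoul,KR)"},
            "unit": {"type": "string", "enum": ["metric", "imperial"]},
        },
        "required": ["location"],
    },
}

# "MODEL_ID" is a placeholder; substitute the id your provider lists.
payload = build_request("MODEL_ID", "지금 서울의 현재 기온이 몇 도인가요?", weather_tool)

# Hand-written example of the response shape (not real server output).
sample_response = {
    "choices": [{
        "finish_reason": "tool_calls",
        "message": {
            "role": "assistant",
            "tool_calls": [{
                "id": "call_0",
                "type": "function",
                "function": {
                    "name": "get_current_weather",
                    "arguments": "{\"location\": \"Seoul,KR\", \"unit\": \"metric\"}",
                },
            }],
        },
    }]
}

print(read_tool_calls(sample_response))
```

The payload would be sent as the JSON body of a POST to the provider's chat completions endpoint. The key difference from the web UI test is that `tools` is passed explicitly, so the server can render the schema into the prompt.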
