# Open WebUI Compatibility

Which models work with Open WebUI for tool calling, and why some don't.

## Compatibility Matrix

| Model | VLLM API | Open WebUI | Notes |
|-------|----------|------------|-------|
| Hermes-3-Llama-3.1-70B | Yes | **No** | Format incompatible |
| Llama-3.3-70B-Instruct | Yes | Yes | Works out of the box |
| Qwen2-72B-Instruct | Yes | Yes | Works with hermes parser |
| Mistral-Nemo-12B | Yes | Yes | Works with mistral parser |

## Why Hermes-3 Doesn't Work with Open WebUI

Open WebUI expects tool calls in the standard OpenAI JSON format:

```json
{
  "tool_calls": [{
    "id": "call_abc123",
    "type": "function",
    "function": {
      "name": "get_weather",
      "arguments": "{\"location\": \"SF\"}"
    }
  }]
}
```

Hermes-3's native format uses ChatML + XML tags:

```xml
<tool_call>
{"name": "get_weather", "arguments": {"location": "SF"}}
</tool_call>
```

VLLM's `--tool-call-parser hermes` converts between these formats, but Open WebUI's tool execution pipeline has additional requirements that the conversion doesn't fully satisfy.

## The Flow

```
Working (Llama 3.3):
Model → Native JSON format → VLLM parser → OpenAI format → Open WebUI ✅

Broken (Hermes-3):
Model → ChatML+XML format → VLLM parser → OpenAI format → Open WebUI ❌
                                                          (format mismatch in execution)
```

## Recommendations

### If you need Open WebUI:

Use **Llama-3.3-70B-Instruct-FP8** — it works immediately with no configuration beyond:

```bash
--tool-call-parser llama3_json --enable-auto-tool-choice
```

### If you're building a custom application:

Use **Hermes-3** — it has the strongest tool-calling quality of the models tested here, and all formats work via the VLLM API.

### If you need both:

Run two VLLM instances:

- Hermes-3 on port 8000 for your custom application
- Llama-3.3 on port 8001 for Open WebUI

Both fit on a 96GB GPU simultaneously (if using smaller context windows, or if you have multiple GPUs).

## Open WebUI Setup for Llama 3.3

1. **Start VLLM:**

   ```bash
   ./configs/llama33_70b_fp8.sh
   ```

2. **Add connection in Open WebUI:**
   - Settings → Connections → OpenAI API
   - URL: `http://your-gpu-server:8000/v1`
   - API Key: (leave empty or use "none")

3. **Enable tools:**
   - Settings → Tools → Enable
   - Add your tool definitions

4. **Test:**
   - Start a new chat with Llama-3.3-70B-Instruct-FP8
   - Ask a question that requires tool use
   - Verify tool calls appear and execute
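## Appendix: Sketch of the Hermes-to-OpenAI Conversion

The format conversion described above can be illustrated in a few lines. This is a simplified sketch of the idea, not VLLM's actual parser implementation (the real parser also handles streaming and malformed model output); the `call_` ID prefix mirrors the OpenAI convention.

```python
import json
import re
import uuid

def hermes_to_openai(text: str) -> dict:
    """Convert Hermes-style <tool_call> blocks into the OpenAI
    tool_calls structure. Sketch only; not VLLM's real parser."""
    tool_calls = []
    for block in re.findall(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", text, re.DOTALL):
        call = json.loads(block)
        tool_calls.append({
            "id": f"call_{uuid.uuid4().hex[:8]}",
            "type": "function",
            "function": {
                "name": call["name"],
                # OpenAI clients expect arguments as a JSON *string*,
                # not a nested object — a common source of mismatches.
                "arguments": json.dumps(call["arguments"]),
            },
        })
    return {"tool_calls": tool_calls}

# Example: a Hermes-3 completion containing one tool call
hermes_output = '<tool_call>{"name": "get_weather", "arguments": {"location": "SF"}}</tool_call>'
converted = hermes_to_openai(hermes_output)
```

Note the `arguments` field: the native Hermes format nests a JSON object, while the OpenAI format requires a serialized string. Subtle differences like this are exactly where downstream consumers such as Open WebUI can break even after parsing succeeds.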
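## Appendix: Testing Tool Calling Outside Open WebUI

For the test step, it can help to exercise tool calling directly against the VLLM endpoint before involving Open WebUI, to isolate where a failure occurs. A minimal sketch of the request payload in the OpenAI function-calling schema (the `get_weather` tool is a hypothetical example; sending it requires a running server, so the request itself is left commented out):

```python
import json

# Hypothetical tool definition, for illustration only
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

payload = {
    "model": "Llama-3.3-70B-Instruct-FP8",
    "messages": [{"role": "user", "content": "What's the weather in SF?"}],
    "tools": tools,
    "tool_choice": "auto",
}

# Against a running server (URL from the setup section above):
# import requests
# r = requests.post("http://your-gpu-server:8000/v1/chat/completions", json=payload)
# print(r.json()["choices"][0]["message"].get("tool_calls"))
```

If the raw API response contains well-formed `tool_calls` but Open WebUI still fails to execute them, the problem is on the Open WebUI side of the pipeline rather than in the model or parser.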