README: --reasoning-format deepseek is default in recent llama.cpp; --jinja + --chat-template-file is enough
README.md
CHANGED

````diff
@@ -44,13 +44,13 @@ llama-server \
   -m Ornstein3.6-35B-A3B-RYS-SABER-BF16.gguf \
   --jinja \
   --chat-template-file chat_template.jinja \
-  --reasoning-format deepseek \
   -ngl 99 -c 8192
 ```
 
 - `--jinja` enables jinja chat-template parsing.
 - `--chat-template-file chat_template.jinja` overrides whatever template is embedded in the GGUF with the correct Qwen3-Thinking one from this repo.
-
+
+Recent llama.cpp builds default `--reasoning-format` to `deepseek`, which splits `<think>...</think>` out of the `content` field into a separate `reasoning_content` field on the OpenAI-compatible response — so just `--jinja --chat-template-file` is enough. If you're on an older build and still see raw `<think>` blocks in `content`, add `--reasoning-format deepseek` explicitly.
 
 ## Support This Work
````
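The split the added paragraph describes can be handled uniformly on the client side. A minimal sketch, assuming an OpenAI-compatible chat-completion `message` dict; the helper name and the sample payloads are illustrative, not part of any API:

```python
import re

def split_reasoning(message: dict) -> tuple[str, str]:
    """Return (reasoning, answer) from an OpenAI-compatible chat message.

    Recent llama.cpp builds (with --reasoning-format deepseek as default)
    already put the <think>...</think> text into `reasoning_content`;
    older builds leave the raw tags inline in `content`, so fall back to
    stripping them out ourselves.
    """
    reasoning = message.get("reasoning_content") or ""
    content = message.get("content") or ""
    if not reasoning:
        m = re.search(r"<think>(.*?)</think>", content, flags=re.DOTALL)
        if m:
            reasoning = m.group(1).strip()
            content = content[m.end():].lstrip()
    return reasoning, content

# Newer build: the server already split the fields.
new_msg = {"reasoning_content": "User wants X.", "content": "Here is X."}
# Older build without --reasoning-format deepseek: raw tags in content.
old_msg = {"content": "<think>User wants X.</think>Here is X."}
```

Either way the caller sees the same `(reasoning, answer)` pair, so the client code does not need to know which llama.cpp build is serving.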