DJLougen committed · Commit caf033d · verified · 1 Parent(s): eb3dc88

README: --reasoning-format deepseek is default in recent llama.cpp; --jinja + --chat-template-file is enough

Files changed (1):
  1. README.md +2 -2
README.md CHANGED

````diff
@@ -44,13 +44,13 @@ llama-server \
   -m Ornstein3.6-35B-A3B-RYS-SABER-BF16.gguf \
   --jinja \
   --chat-template-file chat_template.jinja \
-  --reasoning-format deepseek \
   -ngl 99 -c 8192
   ```
 
 - `--jinja` enables jinja chat-template parsing.
 - `--chat-template-file chat_template.jinja` overrides whatever template is embedded in the GGUF with the correct Qwen3-Thinking one from this repo.
-- `--reasoning-format deepseek` makes llama-server split `<think>...</think>` out into a separate `reasoning_content` field on the response instead of leaving it inline in `content`. Without this flag the tags will still appear in the response body even with the right template.
+
+Recent llama.cpp builds default `--reasoning-format` to `deepseek`, which splits `<think>...</think>` out of the `content` field into a separate `reasoning_content` field on the OpenAI-compatible response — so just `--jinja --chat-template-file` is enough. If you're on an older build and still see raw `<think>` blocks in `content`, add `--reasoning-format deepseek` explicitly.
 
 ## Support This Work
````
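The split that `--reasoning-format deepseek` performs can be sketched client-side as well. This is a hypothetical helper (not llama.cpp's actual implementation) showing what the server does to a completion that begins with a `<think>...</think>` block: the tagged span is moved into a `reasoning_content` field and only the remainder stays in `content`.

```python
import re

def split_reasoning(text: str) -> dict:
    """Mimic the deepseek reasoning-format: pull a leading
    <think>...</think> block out into reasoning_content, leaving
    the rest of the completion in content."""
    m = re.match(r"\s*<think>(.*?)</think>\s*(.*)", text, re.DOTALL)
    if m is None:
        # No think block: content passes through unchanged.
        return {"reasoning_content": None, "content": text}
    return {
        "reasoning_content": m.group(1).strip(),
        "content": m.group(2).strip(),
    }

raw = "<think>The user greets; reply politely.</think>Hello! How can I help?"
result = split_reasoning(raw)
# result["reasoning_content"] holds the think text;
# result["content"] holds only "Hello! How can I help?"
```

With the flag (or a recent default), the server returns this split for you, so a client that only reads `message.content` never sees the raw tags.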