Reasoning

#2
by bewilderbeast - opened

Has anyone managed to use reasoning with this model? I have not been able to get reasoning to work with any of my frontends; the output of the model does not contain any reasoning trace. I have included <|think|> at the beginning of my system prompts.

Here is the command I use to run the model:
```
docker run --rm --init --network=host --gpus all --ipc=host \
  -v /var/llamamodels:/models --name vllm-gemma4 vllm-custom:latest \
  --model /models/cyankiwi/gemma-4-31B-it-AWQ-8bit --port 8001 \
  --served-model-name gemma-4-awq-vllm \
  --reasoning-parser gemma4 --enable-auto-tool-choice --tool-call-parser gemma4 \
  --tensor-parallel-size 1 --max-model-len 160000 --gpu-memory-utilization 0.75
```

vllm-custom is my Docker image with transformers 5.60.dev. I have built it both on vllm-openai:nightly (which identifies as v0.18.2rc) and on vllm-openai:v0.19.0.

Has anyone gotten reasoning working?

This is the instruct version without reasoning; you can identify it by the "-it-" part of the model name @bewilderbeast

edit: sorry for the misinformation, I didn't read properly and was used to this nomenclature from the Qwen think/instruct models

cyankiwi org

Please pass the tag <|think|> in the system content, i.e., {"role": "system", "content": "<|think|>"}, {"role": "user", "content": "Hey, how are you?"}, or pass chat_template_kwargs={"enable_thinking": True} to the chat template.

This is the same with the full-precision model.
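For anyone hitting the same issue: a minimal sketch of what the request body would look like with both options combined, assuming the server from the docker command above is listening on port 8001 with the served model name gemma-4-awq-vllm. vLLM's OpenAI-compatible server accepts a top-level chat_template_kwargs field in the request body, which is how enable_thinking reaches the chat template.

```python
import json

# Request body with both suggestions: <|think|> in the system content
# and enable_thinking passed through chat_template_kwargs.
payload = {
    "model": "gemma-4-awq-vllm",  # --served-model-name from the docker command
    "messages": [
        {"role": "system", "content": "<|think|>"},
        {"role": "user", "content": "Hey, how are you?"},
    ],
    "chat_template_kwargs": {"enable_thinking": True},
}

# Sending it requires the running server, e.g.:
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8001/v1/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
print(json.dumps(payload, indent=2))
```

With the official OpenAI Python client, chat_template_kwargs would go through the extra_body parameter of chat.completions.create, since it is not a standard OpenAI field.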

Thank you for responding. I had already set the <|think|> tag at the beginning of the system message; I additionally had to set chat_template_kwargs, which finally helped. Thank you!

bewilderbeast changed discussion status to closed
