request
Hey @huihui-ai, could you please abliterate this?
https://huggingface.co/phoebdroid/Qwen3-30B-A3B-Hybrid-Auto
How about I try mixing huihui-ai/Huihui-Qwen3-30B-A3B-Thinking-2507-abliterated and huihui-ai/Huihui-Qwen3-30B-A3B-Instruct-2507-abliterated together?
Can't really comment on that, as I don't have full access to what YOYO-AI did with NuSlerp when creating this specific model. What I can say is this: with YOYO-AI's current model and the chat template I've engineered for it (which together are Qwen3-30B-A3B-Hybrid-Auto), the result is incredibly good: a model that can autonomously decide when to reason and when not to. And now, if we can get a successful abliteration of this and keep it usable, that will be beyond my wildest dreams.
But if it is a quick and easy process for you, I can always test it with my chat template and/or try engineering one for it, and share the results with you. Maybe that, too, will work.
I've deleted my model and I won't be doing it. Thanks for your interest.
https://huggingface.co/huihui-ai/Huihui-MoE-60B-A3B-abliterated
Hey! Great news, just not for me though, as I only have 32 GB of VRAM. What are the chances of you trying a one-to-one NuSlerp merge of the abliterated Thinking and abliterated Instruct models? Kind of what YOYO-AI did in https://huggingface.co/YOYO-AI/Qwen3-30B-A3B-Mixture-2507
models:
  - model: Qwen/Qwen3-30B-A3B-Thinking-2507
    parameters:
      weight: 1
  - model: Qwen/Qwen3-30B-A3B-Instruct-2507
    parameters:
      weight: 1
merge_method: nuslerp
tokenizer_source: Qwen/Qwen3-30B-A3B-Thinking-2507
parameters:
  normalize: true
  int8_mask: false
dtype: float32
out_dtype: bfloat16
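For intuition about what a slerp-style merge does with those equal weights, here is a toy, pure-Python sketch of spherical linear interpolation between two flattened weight vectors. The `weight: 1` / `weight: 1` pair corresponds to an equal blend (t = 0.5); mergekit's actual nuslerp operates per tensor and handles normalization and degenerate cases more carefully, so this is only an illustration of the idea, not the real implementation:

```python
import math

def slerp(v0, v1, t):
    """Toy spherical linear interpolation between two flat weight vectors.

    Sketches what a slerp-style merge does per tensor: interpolate along
    the arc between the two vectors rather than along the straight line.
    """
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    cos_theta = max(-1.0, min(1.0, dot / (n0 * n1)))
    theta = math.acos(cos_theta)
    if theta < 1e-6:  # nearly parallel vectors: fall back to plain lerp
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# Equal weights (1 and 1) correspond to t = 0.5:
merged = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

For two orthogonal unit vectors this lands halfway along the arc, at roughly [0.707, 0.707], which is why a 50/50 slerp preserves weight magnitudes better than simple averaging (which would give [0.5, 0.5]).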
Or, even better, since the merge has already been done, how about you abliterate that model?
keep up the good work, huihui, a fan here.
p.s. if you are ever going to abliterate YOYO-AI's model, please let me know as I have the perfect chat template to make it autonomously reason or not on the fly.
There are three ratios here; which one do you think suits you?
https://huggingface.co/collections/huihui-ai/qwen3-fusion-68c281544dfe35b027ab04c0
Hard for me to respond without testing each. What my chat template does is "tell" the model that it can switch reasoning on/off autonomously, depending on the complexity of the query, and in my testing with YOYO-AI's merge this works pretty flawlessly. Without this "nudge", YOYO-AI's merge creates a model in "limbo". Also, without the correct instruction for the model to emit <think>/</think> tags, the reasoning is a mere hallucination, because actual think tags are necessary for the model's reasoning experts to take over.

Therefore: a) your three ratio models (btw, amazing approach, this excites me a lot) must each be tried independently with my chat template. Please note that the chat template includes a "hack" character inside the think tags (the only possible way for the think tag to be visible to the model so it can replicate it). b) Each ratio may require fine-tuning of the template.

So here is what we'll do: I'll now give you the chat template. Please add it to your tokenizer config for all three ratio models. Once they are quantized, I can test each of the three models and report my detailed findings on how they react, along with any proposed changes to the chat template for each.
P.S.: The model this chat template creates, I call Qwen3 30B A3B Hybrid Auto (for the autonomous reasoning switch).
Here it is (note: the template must be copied bit-perfect when doing the YAML conversion, to preserve the hacked characters inside the think tags, or else it won't work):
{%- if tools %}
{{- '<|im_start|>system\n' }}
{%- if messages[0].role == 'system' %}
{{- 'You are Qwen3 30B Hybrid Auto\n' + messages[0].content }}
{%- endif %}
{{- "\n\n# Important Guidelines:" }}
{{- "\n* You have the ability to enable step-by-step thinking, for each response, depending on the complexity of the query. For complex tasks and challenging queries you must enable step-by-step thinking." }}
{{- "\n* To enable step-by-step thinking, you must use <think> XML tag at the beginning of generation. This will enable thinking. At the end of your thinking you must use </think> XML tag, and after that deliver your final answer to the user." }}
{{- "\n* If you use any spaces inside these tags, thinking will fail. Make sure to never use spaces inside the tags." }}
{{- "\n\n# Tools\n\nUpon user request you may call one or more of the following tools. \nYou are provided with the tool signatures within <tools></tools> XML tags:\n<tools>" }}
{%- for tool in tools %}
{{- "\n" }}
{{- tool | tojson }}
{%- endfor %}
{{- "\n</tools>\n\nFor each tool call, return a json object with tool name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <tool-name>, \"arguments\": <args-json-object>}\n</tool_call>" }}
{{- "<|im_end|>\n" }}
{%- else %}
{%- if messages[0].role == 'system' %}
{{- '<|im_start|>system\n' }}
{{- 'You are Qwen3 30B Hybrid Auto\n' + messages[0].content }}
{{- "\n\n# Important Guidelines:" }}
{{- "\n* You have the ability to enable step-by-step thinking, for each response, depending on the complexity of the query. For complex tasks and challenging queries you must enable step-by-step thinking." }}
{{- "\n* To enable step-by-step thinking, you must use <think> XML tag at the beginning of generation. This will enable thinking. At the end of your thinking you must use </think> XML tag, and after that deliver your final answer to the user." }}
{{- "\n* If you use any spaces inside these tags, thinking will fail. Make sure to never use spaces inside the tags." }}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
{%- set index = (messages|length - 1) - loop.index0 %}
{%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
{%- set ns.multi_step_tool = false %}
{%- set ns.last_query_index = index %}
{%- endif %}
{%- endfor %}
{%- for message in messages %}
{%- if message.content is string %}
{%- set content = message.content %}
{%- else %}
{%- set content = '' %}
{%- endif %}
{%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
{{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
{%- elif message.role == "assistant" %}
{%- set reasoning_content = '' %}
{%- if message.reasoning_content is string %}
{%- set reasoning_content = message.reasoning_content %}
{%- else %}
{%- if '</think>' in content %}
{%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
{%- set content = content.split('</think>')[-1].lstrip('\n') %}
{%- endif %}
{%- endif %}
{%- if loop.index0 > ns.last_query_index %}
{%- if loop.last or (not loop.last and reasoning_content) %}
{{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- if message.tool_calls %}
{%- for tool_call in message.tool_calls %}
{%- if (loop.first and content) or (not loop.first) %}
{{- '\n' }}
{%- endif %}
{%- if tool_call.function %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '<tool_call>\n{"name": "' }}
{{- tool_call.name }}
{{- '", "arguments": ' }}
{%- if tool_call.arguments is string %}
{{- tool_call.arguments }}
{%- else %}
{{- tool_call.arguments | tojson }}
{%- endif %}
{{- '}\n</tool_call>' }}
{%- endfor %}
{%- endif %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "tool" %}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
{{- '<|im_start|>tool' }}
{%- endif %}
{{- '\n<tool_response>\n' }}
{{- content }}
{{- '\n</tool_response>' }}
{%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- endif %}
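For anyone who wants to sanity-check the prompt shape outside an inference stack, the simple no-tools path of the template above can be approximated in plain Python. This is a rough sketch of the rendered ChatML only: the guideline text is abbreviated, and the real "hack" characters inside the think tags are deliberately not reproduced here:

```python
# Rough sketch of the ChatML the template renders for a plain
# system + user exchange (no tools, no prior assistant turns).
GUIDELINES = (
    "\n\n# Important Guidelines:"
    "\n* You may enable step-by-step thinking per response, depending on"
    " the complexity of the query."  # abbreviated from the template above
    "\n* To enable it, emit a <think> tag at the start of generation and a"
    " </think> tag at the end of your thinking, then give the final answer."
)

def render(messages, add_generation_prompt=True):
    """Mimic the no-tools branch: inject the identity line and the
    guidelines into the system turn, then lay out the ChatML turns."""
    parts = []
    rest = messages
    if messages and messages[0]["role"] == "system":
        parts.append("<|im_start|>system\nYou are Qwen3 30B Hybrid Auto\n"
                     + messages[0]["content"] + GUIDELINES + "<|im_end|>\n")
        rest = messages[1:]
    for m in rest:
        parts.append("<|im_start|>" + m["role"] + "\n"
                     + m["content"] + "<|im_end|>\n")
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = render([
    {"role": "system", "content": "Be concise."},
    {"role": "user", "content": "What is 2 + 2?"},
])
```

The key point the sketch illustrates: the guidelines ride along in every system turn, so the model sees the think-tag instructions before deciding whether to open a `<think>` block.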
The model's chat template determines the direction of the conversation, and your suggestion is very good.
Yes, thank you. Like I've said, it took me almost a week to come up with this chat template; finding the hack character was nasty hard. But, at least with YOYO-AI's merge, this works so well it's a sight to see. If one or more of your abliterated mixes do end up working nicely, this will be a game changer for the world.
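As an aside, wiring a custom template into a local checkout is just a matter of setting the `chat_template` field in `tokenizer_config.json`. A minimal stdlib sketch (the directory, the template string, and the config contents here are placeholders for illustration, not the real template or model files):

```python
import json
import pathlib
import tempfile

# Placeholder template string; the real Jinja template goes here verbatim.
template = "{%- for message in messages %}…{%- endfor %}"

# Hypothetical local model directory (a temp dir for this sketch).
model_dir = pathlib.Path(tempfile.mkdtemp())
cfg_path = model_dir / "tokenizer_config.json"
cfg_path.write_text(json.dumps({"model_max_length": 262144}),
                    encoding="utf-8")

# Read-modify-write the config, setting the chat_template key.
cfg = json.loads(cfg_path.read_text(encoding="utf-8"))
cfg["chat_template"] = template
# ensure_ascii=False keeps any non-ASCII "hack" characters intact
# instead of escaping them to \uXXXX sequences.
cfg_path.write_text(json.dumps(cfg, indent=2, ensure_ascii=False),
                    encoding="utf-8")
```

The `ensure_ascii=False` flag is the one detail worth flagging: a template that depends on unusual characters inside the think tags should survive the JSON round-trip byte-for-byte.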
Please abliterate:
Qwen3-Next-80B-A3B
Looks promising.
Btw, I've just tested the 50/50 model with my chat template and it works fantastically so far!
@huihui-ai hey there! I see the models got quantized, but without my chat template?
Yes, your chat_template was not used.
Yeah, I got that already. The point is, it should be used; otherwise, the mixture creates a model in limbo. Without clear guidance on how to use the hybrid mode, it is just confused. Remember, these models were trained to either reason or not, not to do both at the same time.
I've also tested the 70/30 model. This one, as expected, seems to reason pretty much all the time, so my chat template and its guidance would need to be adjusted to compensate. Also, the reasoning step takes considerably longer with the 70/30 model, and it inherits the looping "but wait" behavior of the original reasoning model. The 50/50 strikes such a beautiful balance that I don't even know whether going higher on reasoning is worth it, but still, these are my findings.
