Qwen3.6 35B A3B — Opus 4.6 Reasoning Distillation (Merged)
A fine-tuned version of Qwen/Qwen3.6-35B-A3B, trained on high-quality reasoning traces distilled from Claude Opus 4.6.
This is the full merged BF16 model in safetensors format. For quantized GGUF versions ready to use with llama.cpp, see rico03/Qwen3.6-35B-Opus-Reasoning-GGUF.
Training Details
- Base model: Qwen/Qwen3.6-35B-A3B (Mixture-of-Experts, ~3B active parameters)
- Method: QLoRA (r=16, alpha=16, NF4 4-bit quantization)
- Datasets: Crownelius/Opus-4.6-Reasoning-3300x + TeichAI/Claude-Opus-4.6-Reasoning-887x
- Examples: ~3,046 total
- Epochs: 1
- Final training loss: ~0.64
- Hardware: 1× NVIDIA H100 NVL (94 GB)
- Framework: TRL + PEFT (Hugging Face)
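The setup above can be sketched in TRL + PEFT. Only r=16, alpha=16, NF4, one epoch, and the datasets are stated in this card; the target modules, learning rate, and batch sizes below are illustrative assumptions, not the actual training configuration.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Load the base MoE model in 4-bit NF4, as the card states
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.6-35B-A3B",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapter: r and alpha from the card; everything else defaulted
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    task_type="CAUSAL_LM",
)

# One of the two reasoning datasets listed above
dataset = load_dataset("Crownelius/Opus-4.6-Reasoning-3300x", split="train")

training_args = SFTConfig(
    num_train_epochs=1,
    per_device_train_batch_size=1,   # assumption
    gradient_accumulation_steps=8,   # assumption
    bf16=True,
    output_dir="qwen36-opus-qlora",
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```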
Usage with Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "rico03/qwen36-35B-opus-reasoning-merged",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "rico03/qwen36-35B-opus-reasoning-merged"
)
Convert to GGUF
# Get llama.cpp and its conversion dependencies
git clone https://github.com/ggerganov/llama.cpp
pip install -r llama.cpp/requirements.txt

# Convert the merged safetensors model to an F16 GGUF
python3 llama.cpp/convert_hf_to_gguf.py ./qwen36-35B-opus-reasoning-merged \
    --outfile qwen36-opus-f16.gguf \
    --outtype f16

# Quantize (requires a built llama.cpp; build with cmake first)
./llama.cpp/build/bin/llama-quantize qwen36-opus-f16.gguf qwen36-opus-Q3_K_S.gguf Q3_K_S
./llama.cpp/build/bin/llama-quantize qwen36-opus-f16.gguf qwen36-opus-Q2_K.gguf Q2_K
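The two quantization targets trade file size for quality. A rough back-of-envelope estimate is parameters × bits-per-weight / 8; the bits-per-weight figures below are ballpark assumptions for k-quants, not llama.cpp's exact on-disk layouts.

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate GGUF file size in decimal GB."""
    return n_params * bits_per_weight / 8 / 1e9

N = 35e9  # total parameters (MoE total, not the ~3B active)
for name, bpw in [("Q3_K_S", 3.5), ("Q2_K", 2.6)]:  # bpw values are assumptions
    print(f"{name}: ~{gguf_size_gb(N, bpw):.0f} GB")
```

This puts Q3_K_S around 15 GB and Q2_K around 11 GB, which is why the quantized versions fit consumer hardware while the F16 file does not.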
What Improved
- Structured reasoning with explicit thinking before answering
- Better multi-step problem solving and agentic coding
- More consistent response formatting
- Improved mathematical and algorithmic reasoning
- Better frontend and repository-level coding workflows
Hardware Requirements
Requires ~70 GB of VRAM or system RAM for full-precision (BF16) inference. For consumer hardware, use the quantized GGUF versions.
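The ~70 GB figure follows directly from the parameter count: every weight is stored in BF16 (2 bytes), and all 35B MoE parameters must be resident in memory even though only ~3B are active per token.

```python
def bf16_footprint_gb(n_params: float) -> float:
    """Memory needed to hold n_params BF16 weights, in decimal GB."""
    return n_params * 2 / 1e9  # 2 bytes per BF16 parameter

print(bf16_footprint_gb(35e9))  # 70.0
```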
License
Apache 2.0 — same as base model.