Qwen3.6 35B A3B — Opus 4.6 Reasoning Distillation (Merged)

A fine-tuned version of Qwen/Qwen3.6-35B-A3B, trained on high-quality reasoning traces distilled from Claude Opus 4.6.

This is the full merged BF16 model in safetensors format. For quantized GGUF versions ready to use with llama.cpp, see rico03/Qwen3.6-35B-Opus-Reasoning-GGUF.

Training Details

  • Base model: Qwen/Qwen3.6-35B-A3B (MoE, 3B active params)
  • Method: QLoRA (r=16, alpha=16, 4-bit NF4 quantization)
  • Datasets: Crownelius/Opus-4.6-Reasoning-3300x + TeichAI/Claude-Opus-4.6-Reasoning-887x
  • Examples: ~3046 total
  • Epochs: 1
  • Final loss: ~0.64
  • Hardware: NVIDIA H100 NVL 94GB
  • Framework: TRL + PEFT (HuggingFace)
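The exact training script is not published; the following is a minimal sketch of a comparable QLoRA configuration with PEFT and bitsandbytes, matching the hyperparameters listed above (r=16, alpha=16, NF4). The dropout value and target modules are assumptions, not taken from the card.

```python
# Illustrative sketch only -- the actual training script is not published.
# r=16, alpha=16, and NF4 match the card above; dropout and
# target_modules are assumptions.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NF4 quantization, as listed above
    bnb_4bit_compute_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,                     # assumption; not stated in the card
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)
```

These two configs would be passed to `AutoModelForCausalLM.from_pretrained` (as `quantization_config`) and to TRL's `SFTTrainer` (as `peft_config`), respectively.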

Usage with Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "rico03/qwen36-35B-opus-reasoning-merged",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "rico03/qwen36-35B-opus-reasoning-merged"
)
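For prompting, use `tokenizer.apply_chat_template` rather than building prompts by hand. Purely as an illustration of the ChatML-style format that Qwen-family chat templates produce (assuming this fine-tune keeps the base model's template), the structure looks like this:

```python
# Illustration of the ChatML-style prompt format Qwen-family chat
# templates produce (assuming this fine-tune keeps the base template).
# In practice, call tokenizer.apply_chat_template instead.
def format_chatml(messages):
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # generation prompt
    return "".join(parts)

prompt = format_chatml([
    {"role": "user", "content": "Explain QLoRA in one sentence."}
])
print(prompt)
```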

Convert to GGUF

git clone https://github.com/ggerganov/llama.cpp
pip install -r llama.cpp/requirements.txt

python3 llama.cpp/convert_hf_to_gguf.py ./qwen36-35B-opus-reasoning-merged \
    --outfile qwen36-opus-f16.gguf \
    --outtype f16

# Build llama.cpp first to get the llama-quantize binary
cmake -S llama.cpp -B llama.cpp/build
cmake --build llama.cpp/build --config Release

# Quantize
./llama.cpp/build/bin/llama-quantize qwen36-opus-f16.gguf qwen36-opus-Q3_K_S.gguf Q3_K_S
./llama.cpp/build/bin/llama-quantize qwen36-opus-f16.gguf qwen36-opus-Q2_K.gguf Q2_K

What Improved

  • Structured reasoning with explicit thinking before answering
  • Better multi-step problem solving and agentic coding
  • More consistent response formatting
  • Improved mathematical and algorithmic reasoning
  • Better frontend and repository-level coding workflows

Hardware Requirements

Full-precision inference requires roughly 70 GB of VRAM or system RAM. For consumer hardware, use the GGUF version.
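The ~70 GB figure follows directly from the parameter count times the size of a BF16 weight; a quick back-of-envelope check (weights only):

```python
# Back-of-envelope memory estimate: weights only, ignoring activations
# and KV cache, which add several more GB at long context.
# Note: this is an MoE model, but all experts must be resident in memory,
# so total (not active) parameters determine the footprint.
params = 35e9          # total parameters
bytes_per_param = 2    # BF16 = 2 bytes per weight
weights_gb = params * bytes_per_param / 1e9
print(f"{weights_gb:.0f} GB")  # → 70 GB
```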

License

Apache 2.0 — same as base model.
