# Qwen3.5-9B-Antirep
Qwen3.5-9B with an anti-repetition DPO (Direct Preference Optimization) adapter merged into the base weights. This model reduces the tendency of Qwen3.5-9B to fall into repetitive generation loops, particularly during long-form outputs.
Inspired by ConicCat/Qwen3.5-Antirep-27B.
## Training Details
- Base model: Qwen/Qwen3.5-9B
- Method: QLoRA DPO training, then merged into the base model
- Training data: 481 on-policy preference pairs (chosen = clean completions with thinking traces, rejected = degenerate repetitive loops)
- Categories: general (181), reasoning (106), code (93), math (73), safety (28)
- LoRA config: r=32, alpha=16, RSLoRA, targeting all attention + MLP modules
- DPO config: beta=0.1, sigmoid loss
- Training: 3 epochs, 363 steps, ~48 minutes on a single RTX 3090
- Sequence length: 768 tokens (max_prompt=256, max_completion=512)
- Optimizer: paged_adamw_8bit, lr=5e-6, cosine schedule, warmup_ratio=0.1
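The core of the training setup above is DPO with the sigmoid (Bradley-Terry) loss at beta=0.1. A minimal sketch of that objective in plain Python, with made-up illustrative log-probabilities rather than values from the model:

```python
import math

def dpo_sigmoid_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss, sigmoid formulation.

    Each argument is the summed log-probability of a completion under
    the trained policy (pi_*) or the frozen reference model (ref_*).
    """
    chosen_reward = beta * (pi_chosen - ref_chosen)
    rejected_reward = beta * (pi_rejected - ref_rejected)
    margin = chosen_reward - rejected_reward  # the "reward margin" logged during training
    # -log sigmoid(margin): near zero once the policy clearly prefers the chosen completion
    loss = -math.log(1.0 / (1.0 + math.exp(-margin)))
    return loss, margin

# Illustrative numbers only: the policy assigns much higher likelihood
# to the clean completion than to the repetitive one.
loss, margin = dpo_sigmoid_loss(-50.0, -120.0, -55.0, -75.0)
```

With a margin of this size the loss is already tiny, which is the behavior the metrics table below reflects.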
## Training Metrics
| Epoch | Avg Loss | Reward Margins | Accuracy |
|---|---|---|---|
| 1 | ~0.065 | ~5 | 100% |
| 2 | ~0.001 | ~10 | 100% |
| 3 | ~0.002 | ~8-13 | 100% |
## Evaluation
Tested on prompts that previously triggered repetitive outputs from the base model:
| Model | Repetition Rate | Avg Repetition Score |
|---|---|---|
| Base Qwen3.5-9B | 10% | 0.033 |
| This model | 0% | 0.010 |
On this prompt set, the adapter reduces sub-threshold repetition scores by roughly 50-70% and eliminates hard repetition loops entirely.
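The exact repetition metric used above is not specified in this card. One plausible way to compute a score in the same spirit is the fraction of token n-grams that repeat an earlier n-gram; a minimal sketch:

```python
def repetition_score(tokens, n=4):
    """Fraction of n-grams that duplicate an earlier n-gram.

    0.0 means no repeated n-grams; values near 1.0 indicate the output
    has collapsed into a loop. This is a generic heuristic, not
    necessarily the exact metric behind the evaluation table.
    """
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    seen, repeats = set(), 0
    for g in ngrams:
        if g in seen:
            repeats += 1
        seen.add(g)
    return repeats / len(ngrams)

looping = ("the same phrase again and " * 10).split()
varied = "each word here appears exactly once in this sentence".split()
```

A degenerate loop scores close to 1.0, while varied text scores 0.0, so a simple threshold on this value can flag hard repetition.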
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "ToastyPigeon/Qwen3.5-9B-Antirep",
    torch_dtype="auto",  # loads the bf16 weights
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("ToastyPigeon/Qwen3.5-9B-Antirep")
```
## Architecture
Qwen3.5-9B is a hybrid architecture with 32 layers:
- 24 GDN (Gated Delta Net / linear attention) layers
- 8 standard full attention layers (every 4th layer)
- 9B parameters, 248K vocabulary, bf16
The LoRA adapter targeted all attention projections (GDN: in_proj_qkv, in_proj_z, in_proj_a, in_proj_b, out_proj; standard: q/k/v/o_proj) and all MLP projections (gate/up/down_proj).
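Under the stated layout (32 layers, every 4th one full attention), the interleaving can be sketched as below. Treating layers at 0-based indices 3, 7, 11, ... as the full-attention ones is an assumption; the card only says "every 4th layer":

```python
def layer_types(num_layers=32, full_attn_every=4):
    """Label each layer as linear attention (GDN) or full attention,
    assuming every 4th layer (0-based indices 3, 7, 11, ...) is full attention."""
    return [
        "full_attention" if (i + 1) % full_attn_every == 0 else "gated_delta_net"
        for i in range(num_layers)
    ]

layout = layer_types()
```

This reproduces the stated split: 24 GDN layers and 8 full-attention layers.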