DJLougen committed on
Commit 2d30b6d · verified · 1 Parent(s): 0b3d12d

Unsloth Model Card

Files changed (1):
  1. README.md +13 -159

README.md CHANGED
@@ -1,167 +1,21 @@
  ---
- language:
- - en
- license: apache-2.0
- library_name: transformers
  tags:
- - reasoning
- - qwen3.5
- - conversational
  - unsloth
- - self-correction
- - chain-of-thought
- - speculative-decoding
- base_model: unsloth/Qwen3.5-27B
- pipeline_tag: text-generation
  ---
 
- # Harmonic-27B
-
- ![Harmonic-27B](harmonic27B.jpeg)
-
- The flagship of the Harmonic family. A reasoning-focused fine-tune of [Qwen 3.5 27B](https://huggingface.co/unsloth/Qwen3.5-27B) trained on structurally validated data where every row passes automated quality gates. No junk, no filler, no shallow traces.
-
- It scales the training approach proven on [Harmonic-9B](https://huggingface.co/DJLougen/Harmonic-9B) to 27B parameters and pairs with [Harmonic-2B](https://huggingface.co/DJLougen/Harmonic-2B) as a draft model for speculative decoding.
-
- ## The Harmonic Family
-
- | Model | Parameters | Role |
- |---|---|---|
- | [Harmonic-2B](https://huggingface.co/DJLougen/Harmonic-2B) | 2.3B | Draft model for speculative decoding |
- | [Harmonic-9B](https://huggingface.co/DJLougen/Harmonic-9B) | 9.65B | Mid-range reasoning backbone |
- | [Harmonic-Hermes-9B](https://huggingface.co/DJLougen/Harmonic-Hermes-9B) | 9.65B | Stage 2 agentic variant (tool calling) |
- | **Harmonic-27B** | **27B** | **Flagship reasoning model** |
-
- All models share the same training data and reasoning format, enabling speculative decoding across the family with high acceptance rates.
-
- ## Training Approach
-
- Same pipeline as Harmonic-9B: **799 curated rows**, a small, precisely curated dataset rather than tens of thousands of unfiltered examples. The base model already has the knowledge from pretraining; the fine-tune teaches it a reasoning behavior pattern.
-
- Every training row contains explicit self-correction ("wait, that's not right"), verification ("let me check by plugging back in"), and multi-path exploration ("alternatively, I could try..."). The data was generated from multiple frontier models and filtered through a custom structural quality pipeline that enforces reasoning depth, coherence, and flow patterns. 100% of rows pass all quality gates simultaneously.
-
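The kind of structural gate described above can be illustrated with a small sketch. The marker phrases below are hypothetical approximations (the actual pipeline's criteria are not published); a row passes only if every behavior appears at least once:

```python
import re

# Hypothetical marker phrases approximating the structural quality gates
# described above; the real pipeline's criteria are not published.
SELF_CORRECTION = [r"\bwait\b", r"that's not right", r"i need to be careful"]
VERIFICATION = [r"let me check", r"let me verify", r"plugging back in"]
EXPLORATION = [r"\balternatively\b", r"another approach", r"i could try"]

def count_markers(trace: str, patterns: list[str]) -> int:
    """Count how many times any of the given phrases appear in a trace."""
    trace = trace.lower()
    return sum(len(re.findall(p, trace)) for p in patterns)

def passes_gates(trace: str) -> bool:
    # A row passes only if every behavior appears at least once.
    return all(
        count_markers(trace, ps) > 0
        for ps in (SELF_CORRECTION, VERIFICATION, EXPLORATION)
    )

trace = (
    "Alternatively, I could try induction. "
    "Wait, that's not right for n = 0. "
    "Let me verify by plugging back in..."
)
print(passes_gates(trace))  # True
```

A real pipeline would also enforce the depth and coherence metrics listed below, but the all-gates-must-pass structure is the same.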
- ## Training Data Quality
-
- Curated using a custom structural process supervision pipeline:
-
- | Metric | Value |
- |---|---|
- | Signal quality score | 78.7 mean (61.5 min, 90.0 max) |
- | Thinking trace depth | 1,667 words average |
- | Self-correction | 100% of rows (17.2 per row avg) |
- | Verification | 100% of rows (10.3 per row avg) |
- | Exploration | 100% of rows (6.3 per row avg) |
- | Quality gate pass rate | 100% |
-
- ## How It Compares
-
- The same structural quality analysis run against every major public reasoning dataset:
-
- | Dataset | Rows | Think Words | Self-Correction | Verification | Exploration | Signal Score | Gate Pass |
- |---|---|---|---|---|---|---|---|
- | **Harmonic (ours)** | **799** | **1,667** | **100%** | **100%** | **100%** | **78.7** | **100%** |
- | Crownelius/Opus-3300x | 2,160 | 188 | 5.9% | 22.6% | 5.2% | 28.0 | 0.1% |
- | nohurry/Opus-Filtered | 2,326 | 191 | 6.7% | 24.1% | 5.3% | 28.5 | 0.1% |
- | TeichAI/Opus-250x | 250 | 323 | 17.2% | 26.8% | 6.8% | 24.6 | 0.4% |
- | Jackrong/Qwen-700x | 633 | 6,653 | 97.5% | 97.6% | 69.8% | 75.6 | 22.7% |
- | Bespoke-Stratos-17k | 16,710 | 1,322 | 88.2% | 72.7% | 59.7% | 71.7 | 49.0% |
- | glaiveai/reasoning-20m | 22M+ | 799 | 64.1% | 41.4% | 37.3% | 46.2 | 12.8% |
-
- ## Training Configuration
-
- ```yaml
- base_model: unsloth/Qwen3.5-27B
- dataset: 799 curated reasoning rows
- epochs: 1
- learning_rate: 1e-4
- lr_scheduler: cosine
- warmup_ratio: 0.1
- max_seq_length: 8192
- lora_rank: 32
- lora_alpha: 32
- dropout: 0.05
- micro_batch_size: 1
- gradient_accumulation_steps: 4
- weight_decay: 0.01
- ```
-
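The configuration above can be mirrored in plain Python to derive quantities the listing leaves implicit, such as the effective batch size and the LoRA scaling factor. The `TrainConfig` dataclass below is a hypothetical illustration, not part of any training framework:

```python
from dataclasses import dataclass

@dataclass
class TrainConfig:
    # Values copied from the Training Configuration block above.
    base_model: str = "unsloth/Qwen3.5-27B"
    epochs: int = 1
    learning_rate: float = 1e-4
    lr_scheduler: str = "cosine"
    warmup_ratio: float = 0.1
    max_seq_length: int = 8192
    lora_rank: int = 32
    lora_alpha: int = 32
    dropout: float = 0.05
    micro_batch_size: int = 1
    gradient_accumulation_steps: int = 4
    weight_decay: float = 0.01

    @property
    def effective_batch_size(self) -> int:
        # Batch seen by the optimizer after gradient accumulation.
        return self.micro_batch_size * self.gradient_accumulation_steps

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scaling factor alpha / rank; here 32 / 32 = 1.0.
        return self.lora_alpha / self.lora_rank

cfg = TrainConfig()
print(cfg.effective_batch_size)  # 4
print(cfg.lora_scaling)          # 1.0
```

With alpha equal to rank, the LoRA update is applied at unit scale, a common choice when merging adapters back into the base weights.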
88
-
89
- ## Usage
90
-
91
- ```python
92
- from transformers import AutoModelForCausalLM, AutoTokenizer
93
-
94
- model = AutoModelForCausalLM.from_pretrained("DJLougen/Harmonic-27B")
95
- tokenizer = AutoTokenizer.from_pretrained("DJLougen/Harmonic-27B")
96
- ```
97
-
98
- ### With speculative decoding (Harmonic-2B as draft)
99
-
100
- ```python
101
- from transformers import AutoModelForCausalLM
102
-
103
- target = AutoModelForCausalLM.from_pretrained("DJLougen/Harmonic-27B")
104
- draft = AutoModelForCausalLM.from_pretrained("DJLougen/Harmonic-2B")
105
-
106
- outputs = target.generate(
107
- **inputs,
108
- assistant_model=draft,
109
- max_new_tokens=512,
110
- )
111
- ```
112
-
113
- ### Reasoning format
114
-
115
- The model uses think blocks for reasoning:
116
-
117
- ```
118
- <|thinking|>
119
- The user is asking about X. Let me consider two approaches...
120
-
121
- Approach 1: ...
122
- Approach 2: ...
123
-
124
- I will go with Approach 1 because...
125
-
126
- Wait, I need to be careful here - this assumes Y, which may not hold.
127
- Let me verify by checking a special case...
128
-
129
- Yes, that confirms the result.
130
- <|/thinking|>
131
-
132
- [Final answer here]
133
- ```
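Downstream code often needs to separate the trace from the final answer. A minimal sketch, assuming the `<|thinking|>` delimiters shown above (the `split_reasoning` helper is illustrative, not part of the model's tooling):

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a model response into (thinking trace, final answer).

    Assumes the <|thinking|> ... <|/thinking|> delimiters shown above;
    returns an empty trace if no think block is present.
    """
    match = re.search(r"<\|thinking\|>(.*?)<\|/thinking\|>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    thinking = match.group(1).strip()
    answer = text[match.end():].strip()
    return thinking, answer

reply = "<|thinking|>\nCheck the edge case first...\n<|/thinking|>\n\n42"
trace, answer = split_reasoning(reply)
print(answer)  # 42
```

Stripping the trace before display is also a cheap way to handle the verbosity noted under Limitations.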
134
-
135
- ## Intended Use
136
-
137
- - Complex reasoning tasks requiring deep multi-step thinking
138
- - Mathematical problem-solving with self-correction and verification
139
- - Code analysis, generation, and debugging with structured reasoning
140
- - General conversation (conversational ability preserved through training design)
141
- - Base model for Stage 2 agentic fine-tuning (Harmonic-Hermes-27B)
142
- - Target model for speculative decoding with Harmonic-2B
143
-
- ## Limitations
-
- - 27B parameters - requires significant compute (single A100 80GB or equivalent)
- - Reasoning traces can be verbose for simple questions
- - Not optimized for tool calling - agentic Stage 2 variant planned
- - Benchmark evaluation is ongoing
-
- ## Architecture
-
- - **Base**: Qwen 3.5 27B
- - **Training**: LoRA fine-tuning, merged into base weights
- - **Precision**: BF16
- - **Context**: 8192 tokens
-
- ## License
-
- Apache 2.0, the same as the base model. All training data comes from Apache 2.0 or MIT licensed sources; commercial use is fully permitted.
-
- ## Links
-
- - Draft model: [DJLougen/Harmonic-2B](https://huggingface.co/DJLougen/Harmonic-2B)
- - 9B variant: [DJLougen/Harmonic-9B](https://huggingface.co/DJLougen/Harmonic-9B)
- - 9B GGUF: [DJLougen/Harmonic-9B-GGUF](https://huggingface.co/DJLougen/Harmonic-9B-GGUF)
- - Agentic 9B: [DJLougen/Harmonic-Hermes-9B](https://huggingface.co/DJLougen/Harmonic-Hermes-9B)
  ---
+ base_model: unsloth/Qwen3.5-27B
  tags:
+ - text-generation-inference
+ - transformers
  - unsloth
+ - qwen3_5
+ license: apache-2.0
+ language:
+ - en
  ---
 
+ # Uploaded finetuned model
 
+ - **Developed by:** DJLougen
+ - **License:** apache-2.0
+ - **Finetuned from model:** unsloth/Qwen3.5-27B
 
+ This qwen3_5 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
 
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)