DJLougen commited on
Commit
0b1aa35
·
verified ·
1 Parent(s): 00e3e62

Match GGUF model card to main Harmonic-9B card with images and full content

Browse files
.gitattributes CHANGED
@@ -41,3 +41,4 @@ Harmonic-9B-F16.gguf filter=lfs diff=lfs merge=lfs -text
41
  Harmonic-9B-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
42
  Harmonic-9B-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
43
  Harmonic-9B-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
 
 
41
  Harmonic-9B-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
42
  Harmonic-9B-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
43
  Harmonic-9B-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
44
+ training_quality.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -10,6 +10,7 @@ tags:
10
  - self-correction
11
  - llama.cpp
12
  - unsloth
 
13
  base_model: DJLougen/Harmonic-9B
14
  ---
15
 
@@ -23,7 +24,11 @@ base_model: DJLougen/Harmonic-9B
23
 
24
  GGUF quantizations of [Harmonic-9B](https://huggingface.co/DJLougen/Harmonic-9B) for local inference with llama.cpp, Ollama, LM Studio, and other GGUF-compatible runtimes.
25
 
26
- Harmonic-9B is a reasoning-focused fine-tune of Qwen 3.5 9B trained on structurally validated data where every row passes automated quality gates. See the [full model card](https://huggingface.co/DJLougen/Harmonic-9B) for training details, data quality analysis, and pipeline documentation.
 
 
 
 
27
 
28
  ## Available Quantizations
29
 
@@ -44,12 +49,90 @@ Harmonic-9B is a reasoning-focused fine-tune of Qwen 3.5 9B trained on structura
44
  | `Harmonic-9B-IQ4_XS.gguf` | IQ4_XS | 4.3 | ~4.9 GB | Smallest 4-bit, importance matrix |
45
  | `Harmonic-9B-Q3_K_M.gguf` | Q3_K_M | 3.9 | ~4.6 GB | Smallest footprint, some quality loss |
46
 
47
- ## Recommended Quant
48
 
49
  **Q5_K_M** for most users - fits in 8GB VRAM with room for context, minimal quality degradation on reasoning tasks.
50
 
51
  **Q8_0** if you have the VRAM - preserves the full reasoning depth that the model was trained for.
52
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
53
  ## Usage
54
 
55
  ### Ollama
@@ -68,28 +151,57 @@ ollama run DJLougen/Harmonic-9B-GGUF
68
 
69
  Download any quantization and load in LM Studio. The model follows standard ChatML formatting.
70
 
71
- ## What Makes This Model Different
72
 
73
- Harmonic-9B was trained with a focus on structural reasoning quality over data volume:
74
 
75
- - Deep reasoning with self-correction, verification, and exploration in every training row (100% quality gate pass rate)
76
- - 1,817 curated rows following the Less Is More hypothesis - precision over volume
 
77
 
78
- For agentic tool calling, see Harmonic-Hermes-9B (coming soon).
 
79
 
80
- ## Format
81
 
82
- The model uses `<think>` blocks for reasoning. See the [full model card](https://huggingface.co/DJLougen/Harmonic-9B) for format examples.
 
83
 
84
- ### Vision (Multimodal)
 
85
 
86
- This model includes `Harmonic-9B-BF16-mmproj.gguf` - the vision projector for multimodal inference. Use with llama.cpp's `--mmproj` flag for image understanding tasks.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
87
 
88
  ## License
89
 
90
- Apache 2.0 - fully commercial use permitted.
91
 
92
  ## Links
93
 
94
- - Full model: [DJLougen/Harmonic-9B](https://huggingface.co/DJLougen/Harmonic-9B)
95
- - Stage 2 dataset: [DJLougen/hermes-agent-traces-filtered](https://huggingface.co/datasets/DJLougen/hermes-agent-traces-filtered)
 
 
 
10
  - self-correction
11
  - llama.cpp
12
  - unsloth
13
+ - conversational
14
  base_model: DJLougen/Harmonic-9B
15
  ---
16
 
 
24
 
25
  GGUF quantizations of [Harmonic-9B](https://huggingface.co/DJLougen/Harmonic-9B) for local inference with llama.cpp, Ollama, LM Studio, and other GGUF-compatible runtimes.
26
 
27
+ A reasoning-focused fine-tune of [Qwen 3.5 9B](https://huggingface.co/Qwen/Qwen3.5-9B) trained on structurally validated data where every row passes automated quality gates. No junk, no filler, no shallow traces.
28
+
29
+ The name comes from harmonic analysis of reasoning patterns - the structural signal that separates genuine thinking from surface-level chain-of-thought.
30
+
31
+ For the agentic tool-calling variant, see [Harmonic-Hermes-9B](https://huggingface.co/DJLougen/Harmonic-Hermes-9B) (coming soon) - a Stage 2 fine-tune of this model on quality-filtered agent traces from [DJLougen/hermes-agent-traces-filtered](https://huggingface.co/datasets/DJLougen/hermes-agent-traces-filtered).
32
 
33
  ## Available Quantizations
34
 
 
49
  | `Harmonic-9B-IQ4_XS.gguf` | IQ4_XS | 4.3 | ~4.9 GB | Smallest 4-bit, importance matrix |
50
  | `Harmonic-9B-Q3_K_M.gguf` | Q3_K_M | 3.9 | ~4.6 GB | Smallest footprint, some quality loss |
51
 
52
+ ### Recommended Quant
53
 
54
  **Q5_K_M** for most users - fits in 8GB VRAM with room for context, minimal quality degradation on reasoning tasks.
55
 
56
  **Q8_0** if you have the VRAM - preserves the full reasoning depth that the model was trained for.
57
 
58
+ ### Vision (Multimodal)
59
+
60
+ This model includes `Harmonic-9B-BF16-mmproj.gguf` - the vision projector for multimodal inference. Use with llama.cpp's `--mmproj` flag for image understanding tasks.
61
+
62
+ ## Training Approach
63
+
64
+ ![Pipeline](pipeline.png)
65
+
66
+ **1,817 curated rows.** That's it. Following the [LIMO hypothesis](https://huggingface.co/papers/2502.03387) (Less Is More for Reasoning), Harmonic uses a small, precisely curated dataset instead of tens of thousands of unfiltered examples. The base model already has the knowledge from pretraining - the fine-tune teaches it a reasoning behavior pattern.
67
+
68
+ Every training row contains explicit self-correction ("wait, that's not right"), verification ("let me check by plugging back in"), and multi-path exploration ("alternatively, I could try..."). The data was generated from multiple frontier models and filtered through a custom structural quality pipeline that enforces reasoning depth, coherence, and flow patterns. 100% of rows pass all quality gates simultaneously.
69
+
70
+ A small set of everyday conversation data is mixed in to preserve the base model's conversational ability - calibrated by token ratio analysis to prevent the reasoning data from drowning out conversational patterns during training.
71
+
72
+ ## Training Data Quality
73
+
74
+ ![Training Quality](training_quality.png)
75
+
76
+ The reasoning data was curated using a custom structural process supervision pipeline. Key metrics:
77
+
78
+ | Metric | Value |
79
+ |---|---|
80
+ | Signal quality score | 78.7 mean (61.5 min, 90.0 max) |
81
+ | Thinking trace depth | 1,667 words average |
82
+ | Self-correction | 100% of rows (17.2 per row avg) |
83
+ | Verification | 100% of rows (10.3 per row avg) |
84
+ | Exploration | 100% of rows (6.3 per row avg) |
85
+ | Quality gate pass rate | 100% |
86
+
87
+ Every row was scored across multiple structural dimensions and only rows passing all thresholds simultaneously were included. No rows were manually curated - the pipeline is fully automated and reproducible.
88
+
89
+ ## How It Compares
90
+
91
+ ![Competitor Comparison](competitor_comparison.png)
92
+
93
+ We ran our structural quality analysis against every major public reasoning dataset used for Opus/Qwen distillation. The results:
94
+
95
+ | Dataset | Rows | Think Words | Self-Correction | Verification | Exploration | Signal Score | Gate Pass |
96
+ |---|---|---|---|---|---|---|---|
97
+ | **Harmonic (ours)** | **1,817** | **1,667** | **100%** | **100%** | **100%** | **78.7** | **100%** |
98
+ | Crownelius/Opus-3300x | 2,160 | 188 | 5.9% | 22.6% | 5.2% | 28.0 | 0.1% |
99
+ | nohurry/Opus-Filtered | 2,326 | 191 | 6.7% | 24.1% | 5.3% | 28.5 | 0.1% |
100
+ | TeichAI/Opus-250x | 250 | 323 | 17.2% | 26.8% | 6.8% | 24.6 | 0.4% |
101
+ | Jackrong/Qwen-700x | 633 | 6,653 | 97.5% | 97.6% | 69.8% | 75.6 | 22.7% |
102
+ | Bespoke-Stratos-17k | 16,710 | 1,322 | 88.2% | 72.7% | 59.7% | 71.7 | 49.0% |
103
+ | glaiveai/reasoning-20m | 22M+ | 799 | 64.1% | 41.4% | 37.3% | 46.2 | 12.8% |
104
+ | KingNish/reasoning-20k | 19,944 | 132 | 0.7% | 4.2% | 4.3% | 27.4 | 0.0% |
105
+
106
+ The popular Opus distillation datasets (Crownelius, nohurry, TeichAI) have less than 1% quality gate pass rate. Their thinking traces average under 200 words with near-zero self-correction. Models trained on this data learn to produce short, shallow chain-of-thought that looks like reasoning but lacks the structural behaviors that make reasoning reliable.
107
+
108
+ Jackrong and Stratos are closer competitors but still fall short on consistency. Jackrong has massive traces (6,653 words avg) but only 22.7% pass the quality gate - the thinking is verbose but wanders. Stratos has decent markers but 49% of rows still fail, meaning half the gradient updates during training push the model toward shallow patterns.
109
+
110
+ Harmonic's data is smaller by design. Every row passes. Every gradient update reinforces genuine reasoning behavior.
111
+
112
+ ## Reasoning Flow
113
+
114
+ ![Reasoning Flow](reasoning_flow.png)
115
+
116
+ Marker density measured across 20 equal segments of each thinking trace. The characteristic curve shows reasoning intensity building through the middle of the trace and peaking in the later segments as the model enters verification and self-correction before committing to an answer.
117
+
118
+ ## Training Configuration
119
+
120
+ ```
121
+ base_model: Qwen/Qwen3.5-9B
122
+ dataset: 1,459 reasoning + 358 conversation rows
123
+ epochs: 1
124
+ learning_rate: 1e-4
125
+ lr_scheduler: cosine
126
+ warmup_ratio: 0.1
127
+ max_seq_length: 8192
128
+ lora_rank: 32
129
+ lora_alpha: 32
130
+ dropout: 0.05
131
+ micro_batch_size: 1
132
+ gradient_accumulation_steps: 4
133
+ weight_decay: 0.01
134
+ ```
135
+
136
  ## Usage
137
 
138
  ### Ollama
 
151
 
152
  Download any quantization and load in LM Studio. The model follows standard ChatML formatting.
153
 
154
+ ### Reasoning format
155
 
156
+ The model uses `<think>` blocks for reasoning:
157
 
158
+ ```
159
+ <think>
160
+ The user is asking about X. Let me consider two approaches...
161
 
162
+ Approach 1: ...
163
+ Approach 2: ...
164
 
165
+ I'll go with Approach 1 because...
166
 
167
+ Wait, I need to be careful here - this assumes Y, which may not hold.
168
+ Let me verify by checking a special case...
169
 
170
+ Yes, that confirms the result.
171
+ </think>
172
 
173
+ [Final answer here]
174
+ ```
175
+
176
+ ## Intended Use
177
+
178
+ - Reasoning tasks requiring genuine multi-step thinking
179
+ - Mathematical problem-solving with self-correction
180
+ - Code analysis and generation with structured verification
181
+ - General conversation (conversational ability preserved through training design)
182
+ - Base model for Stage 2 agentic fine-tuning
183
+
184
+ ## Limitations
185
+
186
+ - 9B parameter model - not suitable for tasks requiring extensive world knowledge
187
+ - Reasoning traces can be verbose for simple questions
188
+ - Not optimized for tool calling - see Harmonic-Hermes-9B (coming soon) for agentic use
189
+ - Benchmark evaluation is ongoing
190
+
191
+ ## Architecture
192
+
193
+ - **Base**: Qwen 3.5 9B (9.65B parameters)
194
+ - **Training**: LoRA fine-tuning, merged into base weights
195
+ - **Precision**: BF16
196
+ - **Context**: 8192 tokens
197
 
198
  ## License
199
 
200
+ Apache 2.0 - same as the base model. All training data is from Apache 2.0 or MIT licensed sources. Fully commercial use permitted.
201
 
202
  ## Links
203
 
204
+ - Full model weights: [DJLougen/Harmonic-9B](https://huggingface.co/DJLougen/Harmonic-9B)
205
+ - Agentic variant: Harmonic-Hermes-9B (coming soon)
206
+ - Filtered agent dataset: [DJLougen/hermes-agent-traces-filtered](https://huggingface.co/datasets/DJLougen/hermes-agent-traces-filtered)
207
+ - LIMO paper: [Less is More for Reasoning](https://huggingface.co/papers/2502.03387)
competitor_comparison.png ADDED
pipeline.png ADDED
reasoning_flow.png ADDED
training_quality.png ADDED

Git LFS Details

  • SHA256: db16402ef1b0d8482a3e5fa42e9114f7d23ebcfef146ce4f8efd46729523893c
  • Pointer size: 131 Bytes
  • Size of remote file: 127 kB