Oysiyl committed (verified)

Commit 0ecad4f · 1 Parent(s): 47f7245

Update 27B README + normalized loss curve

Refresh model card with completed HF Job metrics (step 1000, train_loss 1.916), add held-out observed output/judgment, and attach normalized training loss vs progress SVG from job 69ca19caf900226fc14aea81.

Files changed (2)
  1. README.md +113 -11
  2. training_loss_vs_progress.svg +32 -0
README.md CHANGED
@@ -1,21 +1,123 @@
  ---
  base_model: unsloth/Qwen3.5-27B
  tags:
- - text-generation-inference
- - transformers
  - unsloth
  - qwen3_5
- license: apache-2.0
- language:
- - en
  ---

- # Uploaded finetuned model

- - **Developed by:** Oysiyl
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/Qwen3.5-27B

- This qwen3_5 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
  ---
+ language:
+ - en
+ license: apache-2.0
  base_model: unsloth/Qwen3.5-27B
+ library_name: transformers
  tags:
  - unsloth
  - qwen3_5
+ - lora
+ - rewriting
+ - style-transfer
+ - unslop
+ pipeline_tag: text-generation
  ---

+ # qwen3.5-27b-unslop-good-lora-v1
+
+ A Qwen 3.5 27B fine-tune for unslop rewriting: taking AI-sounding passages and rewriting them into cleaner, more natural prose while preserving their meaning.
+
+ This run is the first large dense-text Qwen 3.5 follow-up to the earlier pilot family. It is intended as a serious test of whether a newer large text-only model can match or beat the promising 30B-A3B pilot without depending on the VL backbone path.
+
+ ## How it was trained
+ - Base model: `unsloth/Qwen3.5-27B`
+ - Training path: Unsloth fine-tuning on Hugging Face Jobs
+ - Dataset: `N8Programs/unslop-good`
+ - Rows used: 1000 (full training split)
+ - Objective: conversational rewrite / style cleanup
+
+ ## Training shape
+ - Hardware: A100 80GB (`a100-large`)
+ - max_seq_length: 6144
+ - num_train_epochs: 2
+ - batch_size: 1
+ - gradient_accumulation_steps: 1
+ - learning_rate: 1e-4
+ - scheduler: cosine
+ - warmup_steps: 50
+ - LoRA rank: 8
+ - LoRA alpha: 20
+ - LoRA dropout: 0.0
+ - 4-bit loading
+ - bf16 training
+
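+ The training script itself is not published here, but the settings above map onto an Unsloth + TRL run roughly as sketched below. This is a reconstruction from the listed hyperparameters, not the actual job code; the dataset's chat-template formatting step is omitted and assumed.
+
+ ```python
+ # Hypothetical reconstruction from the hyperparameters listed above;
+ # not the real HF Jobs script.
+ from unsloth import FastLanguageModel  # import unsloth first so its patches apply
+ from datasets import load_dataset
+ from trl import SFTConfig, SFTTrainer
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     "unsloth/Qwen3.5-27B",
+     max_seq_length=6144,
+     load_in_4bit=True,  # 4-bit loading
+ )
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r=8,             # LoRA rank
+     lora_alpha=20,
+     lora_dropout=0.0,
+ )
+
+ # 1000 rows, full training split; conversational formatting not shown here.
+ dataset = load_dataset("N8Programs/unslop-good", split="train")
+
+ trainer = SFTTrainer(
+     model=model,
+     processing_class=tokenizer,
+     train_dataset=dataset,
+     args=SFTConfig(
+         per_device_train_batch_size=1,
+         gradient_accumulation_steps=1,
+         num_train_epochs=2,
+         learning_rate=1e-4,
+         lr_scheduler_type="cosine",
+         warmup_steps=50,
+         bf16=True,
+         output_dir="outputs",
+     ),
+ )
+ trainer.train()
+ ```
+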
+ ## Training outcome
+ This 27B run completed successfully on Hugging Face Jobs and pushed its adapter repo cleanly.
+
+ - train_runtime: 10665.33s
+ - train_loss: 1.916
+ - final step: 1000
+
+ Operator notes:
+ - model load succeeded on `unsloth/Qwen3.5-27B`
+ - dataset formatting and tokenization completed for the full planned run length
+ - training reached step 1000 and emitted final trainer metrics
+ - adapter push completed
+ - GGUF export and upload completed (q2_k, q4_k_m, q6_k, q8_0)
+
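+ For a quick local sanity check of one of those quants, a minimal `llama-cpp-python` sketch follows. The GGUF filename is a placeholder; substitute the actual file from this repo's listing.
+
+ ```python
+ # Hypothetical smoke test of an exported GGUF quant.
+ # model_path is a placeholder filename, not a confirmed artifact name.
+ from llama_cpp import Llama
+
+ llm = Llama(model_path="qwen3.5-27b-unslop-q4_k_m.gguf", n_ctx=6144)
+ out = llm.create_chat_completion(
+     messages=[{
+         "role": "user",
+         "content": "Rewrite this passage into cleaner, more natural prose, "
+                    "preserving its meaning: ...",
+     }],
+     max_tokens=512,
+ )
+ print(out["choices"][0]["message"]["content"])
+ ```
+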
+ ## Intended use
+ Use this model as a pipeline stage for:
+ - rewriting AI-sounding prose into more natural text
+ - testing whether a large dense Qwen 3.5 model can deliver more faithful unslop behavior than the earlier dense pilots
+ - comparing a large text-only family directly against the stronger 30B-A3B pilot result
+
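+ For the adapter path, a minimal inference sketch is shown below. The adapter repo id is inferred from the model card title and is an assumption; verify it against the actual repo before use.
+
+ ```python
+ # Minimal adapter inference sketch; the PEFT repo id below is assumed, not confirmed.
+ from peft import PeftModel
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ base = AutoModelForCausalLM.from_pretrained(
+     "unsloth/Qwen3.5-27B", torch_dtype="auto", device_map="auto"
+ )
+ model = PeftModel.from_pretrained(base, "Oysiyl/qwen3.5-27b-unslop-good-lora-v1")
+ tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen3.5-27B")
+
+ messages = [{
+     "role": "user",
+     "content": "Rewrite the passage below so it reads like natural human prose. "
+                "Keep every plot beat and line of dialogue intact.\n\n<passage here>",
+ }]
+ inputs = tokenizer.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+ output = model.generate(inputs, max_new_tokens=1024)
+ print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
+ ```
+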
+ ## Limitations
+ - still trained on the same small 1000-row dataset
+ - training success does not imply fidelity success
+ - long-form fidelity still needs stricter, repeated side-by-side judging against the 30B reference lane
+ - this 27B output can be more interventionist than is ideal for minimal-touch rewrites
+
+ ## Training loss vs training progress
+
+ Normalized curve from the completed 27B run:
+
+ ![Qwen3.5-27B training loss vs normalized progress](./training_loss_vs_progress.svg)
+
+ Curve notes:
+ - first logged loss: 2.279
+ - final logged step loss: 1.864
+ - trainer-reported final train_loss: 1.916 (the run-wide average, hence slightly above the last logged step)
+ - final step: 1000
+
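+ The attached SVG was produced by the job tooling, but a normalized curve like it can be rebuilt from the trainer log history. A minimal sketch, assuming `(step, loss)` entries in the style of `trainer.state.log_history`:
+
+ ```python
+ # Rebuild a loss-vs-normalized-progress curve from trainer-style log entries.
+ import matplotlib.pyplot as plt
+
+ # Example entries only; the real history holds many more logged steps.
+ log_history = [{"step": 10, "loss": 2.279}, {"step": 1000, "loss": 1.864}]
+ steps = [e["step"] for e in log_history if "loss" in e]
+ losses = [e["loss"] for e in log_history if "loss" in e]
+ progress = [s / steps[-1] for s in steps]  # normalize by the final step (1000)
+
+ plt.plot(progress, losses)
+ plt.xlabel("training progress (step / final step)")
+ plt.ylabel("training loss")
+ plt.savefig("training_loss_vs_progress.svg")
+ ```
+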
+ ## Recommended evaluation sample
+ Use this full fiction passage for held-out testing:
+
+ > The wind rose all at once, as if the mountain itself had inhaled.
+ > She stood at the edge of the pass, cloak snapping like a torn banner behind her, staring down at the valley where the last lights were going out one by one. Each flame extinguished felt deliberate—an answer to a question she had been too afraid to ask. Somewhere below, bells began to ring, not in warning, but in mourning. They already knew.
+ > “Don’t,” he said, breathless, reaching for her arm. His fingers brushed fabric, then air. “If you cross that line, there’s no coming back.”
+ > She didn’t turn. The cold burned her lungs, sharpened her thoughts. “There was never a way back,” she replied softly. “Only a way forward you were willing to pretend didn’t exist.”
+ > Thunder rolled, low and ancient. For a moment, the sky fractured into white fire, and in that instant he saw her face—resolved, terrified, incandescent with purpose. Not the girl he had sworn to protect, but the woman the world had demanded she become.
+ > “You’ll die,” he said.
+ > She smiled then, finally facing him, eyes bright with unshed tears. “So will everyone,” she answered. “The difference is what we leave burning when we’re gone.”
+ > She stepped forward.
+ > The mountain exhaled.
+
+ ## Full observed output on that sample
+
+ Observed output from local eval run:
+
+ > The wind picked up, sudden and violent, as if the mountain had drawn a sharp breath. She stood at the lip of the pass, her cloak whipping behind her like a flag of surrender, looking down into the valley where the last lights were flickering out one by one. Not blown out. Extinguished. As if someone had reached down and snuffed them. Below, the bells started to ring—not the clear peal of morning, but the slow, heavy toll of evening. They already knew. “Don’t,” he said, his voice barely audible over the rising gale. He reached for her, but his hand passed through the air where her shoulder should have been. “If you step over that ridge, you’re dead.” She didn’t look back. The cold bit into her cheeks, but her eyes were steady. “I’ve been dead since I left you,” she said quietly. “I’m just finishing the job now.” Thunder cracked overhead, close enough to shake the ground beneath their feet. For a second, lightning split the sky, illuminating her face—a mask of resolve carved from grief and fury. Not the girl he’d known, but something harder, sharper, forged in fire. “You’ll kill yourself,” he shouted. She turned then, meeting his gaze with a smile that didn’t reach her eyes. “Maybe,” she said. “But I’ll take some of them with me.” And with that, she stepped forward. The mountain roared.
+
+ ## Judgment
+
+ Blunt judgment: this 27B run is operationally successful and qualitatively strong, but not yet a strict-fidelity winner.
+
+ Why:
+ - output is fluent and coherent end-to-end
+ - it preserves the core scene structure and emotional trajectory
+ - it still injects extra aggression/drama in places (for example, more forceful phrasing and added intent)
+ - that means the rewrite can be too interventionist for strict minimal-touch unslop behavior
+
+ ## Comparison vs pilot series
+
+ - **0.6B**: failed badly; became a different story
+ - **1.7B**: more fluent than 0.6B, but still invented scenes and structure
+ - **4B**: first clearly improved text-only model in the series; mostly kept the scene intact, but still drifted and over-shaped the prose
+ - **9B (retrained)**: strong practical baseline in the new family; often cleaner and more controlled on medium-length rewrites
+ - **30B-A3B VL Instruct**: still the safest fidelity-first reference on long-form passages
+ - **Qwen3.5 27B (this run)**: stronger language quality than the smaller dense lanes, but currently trends more stylistic and aggressive than ideal in strict-preservation mode
+
+ ## Conclusion
+
+ This repo is now a completed post-run artifact for Qwen 3.5 27B, with real training metrics, a normalized loss curve, and held-out sample output. The core result: 27B clearly works and writes well, but its current behavior still needs tighter rewrite constraints before it can replace the 30B reference lane for fidelity-sensitive unslop. It is a serious positive experiment, not a final default.
training_loss_vs_progress.svg ADDED