Update 27B README + normalized loss curve
Refresh model card with completed HF Job metrics (step 1000, train_loss 1.916), add held-out observed output/judgment, and attach normalized training loss vs progress SVG from job 69ca19caf900226fc14aea81.
- README.md +113 -11
- training_loss_vs_progress.svg +32 -0
README.md
CHANGED
Removed: the auto-generated model card stub (frontmatter with `base_model: unsloth/Qwen3.5-27B` and tags `text-generation-inference`, `transformers`, `unsloth`, `qwen3_5`, plus placeholder heading and bullet lines).

New contents:
---
language:
- en
license: apache-2.0
base_model: unsloth/Qwen3.5-27B
library_name: transformers
tags:
- unsloth
- qwen3_5
- lora
- rewriting
- style-transfer
- unslop
pipeline_tag: text-generation
---

# qwen3.5-27b-unslop-good-lora-v1

A Qwen 3.5 27B fine-tune for unslop rewriting: taking AI-sounding passages and attempting to rewrite them into cleaner, more natural prose while preserving meaning.

This run is the first large dense-text Qwen 3.5 follow-up after the earlier pilot family. It is intended as a serious test of whether a newer large text-only model can match or beat the promising 30B-A3B pilot without depending on the VL backbone path.

## How it was trained

- Base model: `unsloth/Qwen3.5-27B`
- Training path: Unsloth fine-tuning on Hugging Face Jobs
- Dataset: `N8Programs/unslop-good`
- Rows used: 1000 (full training split)
- Objective: conversational rewrite / style cleanup

## Training shape

- hardware: A100 80GB (`a100-large`)
- max_seq_length: 6144
- num_train_epochs: 2
- batch_size: 1
- gradient_accumulation_steps: 1
- learning_rate: 1e-4
- scheduler: cosine
- warmup_steps: 50
- LoRA rank: 8
- LoRA alpha: 20
- LoRA dropout: 0.0
- 4-bit loading
- bf16 training
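
For intuition on what these adapter settings imply, here is a sketch of the standard LoRA bookkeeping (assuming the common alpha/r scaling convention rather than rsLoRA; the 5120-wide projection below is purely illustrative, not a measured Qwen3.5-27B dimension):

```python
# Standard LoRA arithmetic for the settings above (rank 8, alpha 20).
lora_rank = 8
lora_alpha = 20

# Common (non-rsLoRA) scaling applied to the low-rank update B @ A.
scaling = lora_alpha / lora_rank  # 2.5

def lora_extra_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters LoRA adds to one d_in x d_out linear layer."""
    # A is (r x d_in), B is (d_out x r).
    return r * (d_in + d_out)

# Illustrative 5120 x 5120 projection (hypothetical dimension).
print(scaling, lora_extra_params(5120, 5120, lora_rank))
```

At rank 8 the adapter stays tiny relative to the 27B base, which is why the push and GGUF export steps are cheap.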

## Training outcome

This 27B run completed successfully on Hugging Face Jobs and pushed its adapter repo cleanly.

- train_runtime: 10665.33s
- train_loss: 1.916
- final step: 1000
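
A quick back-of-envelope from these totals (derived here, not a logged metric) gives the average wall-clock cost per optimizer step:

```python
# Derived from the trainer-reported totals above.
train_runtime_s = 10665.33
final_step = 1000

seconds_per_step = train_runtime_s / final_step
print(f"{seconds_per_step:.2f} s/step")  # roughly 10.7 s per step on the A100 80GB
```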

Operator notes:

- model load path succeeded on `unsloth/Qwen3.5-27B`
- dataset formatting and tokenization completed to full planned run length
- training reached step 1000 and emitted final trainer metrics
- adapter push completed
- GGUF export + GGUF upload completed (q2_k, q4_k_m, q6_k, q8_0)

## Intended use

Use this model as a pipeline stage for:

- rewriting AI-sounding prose into more natural text
- testing whether a large dense Qwen 3.5 model can deliver more faithful unslop behavior than the earlier dense pilots
- comparing a large text-only family directly against the stronger 30B-A3B pilot result
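
As a pipeline stage, the model would typically be driven with a chat-style rewrite request. A minimal sketch follows; the system instruction is a hypothetical stand-in, since this card does not record the exact prompt template used in training:

```python
def build_rewrite_messages(passage: str) -> list:
    """Build a chat-format request for a minimal-touch unslop rewrite.

    The system instruction here is a hypothetical placeholder, not the
    trained prompt template.
    """
    system = (
        "Rewrite the passage so it reads as natural, human-written prose. "
        "Preserve meaning, structure, and length; change as little as possible."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": passage},
    ]

messages = build_rewrite_messages("The wind rose all at once...")
```

The resulting `messages` list can then go through the usual `tokenizer.apply_chat_template(...)` and `model.generate(...)` flow in transformers.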

## Limitations

- still trained on the same small 1000-row dataset
- training success does not imply fidelity success
- long-form fidelity still needs stricter, repeated side-by-side judging vs the 30B reference lane
- this 27B output can become more interventionist than ideal for minimal-touch rewrites

## Training loss vs training progress

Normalized curve from the completed 27B run:

![Training loss vs training progress](training_loss_vs_progress.svg)

Curve notes:

- first logged loss: 2.279
- final logged step loss: 1.864
- trainer-reported final train_loss: 1.916
- final step: 1000
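
The x-axis of the attached SVG is plain step-over-final-step normalization, mapping progress onto [0, 1]. A minimal sketch (only the endpoint losses below come from the run; the step numbers and midpoint are illustrative):

```python
def normalize_curve(steps, losses):
    """Map raw optimizer step numbers onto [0, 1] training progress."""
    final = steps[-1]
    return [(step / final, loss) for step, loss in zip(steps, losses)]

# Endpoint losses are from the run log; the steps and midpoint are illustrative.
points = normalize_curve([10, 500, 1000], [2.279, 2.05, 1.864])
print(points[0], points[-1])
```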

## Recommended evaluation sample

Use this full fiction passage for held-out testing:

> The wind rose all at once, as if the mountain itself had inhaled.
>
> She stood at the edge of the pass, cloak snapping like a torn banner behind her, staring down at the valley where the last lights were going out one by one. Each flame extinguished felt deliberate—an answer to a question she had been too afraid to ask. Somewhere below, bells began to ring, not in warning, but in mourning. They already knew.
>
> “Don’t,” he said, breathless, reaching for her arm. His fingers brushed fabric, then air. “If you cross that line, there’s no coming back.”
>
> She didn’t turn. The cold burned her lungs, sharpened her thoughts. “There was never a way back,” she replied softly. “Only a way forward you were willing to pretend didn’t exist.”
>
> Thunder rolled, low and ancient. For a moment, the sky fractured into white fire, and in that instant he saw her face—resolved, terrified, incandescent with purpose. Not the girl he had sworn to protect, but the woman the world had demanded she become.
>
> “You’ll die,” he said.
>
> She smiled then, finally facing him, eyes bright with unshed tears. “So will everyone,” she answered. “The difference is what we leave burning when we’re gone.”
>
> She stepped forward.
>
> The mountain exhaled.

## Full observed output on that sample

Observed output from local eval run:

> The wind picked up, sudden and violent, as if the mountain had drawn a sharp breath. She stood at the lip of the pass, her cloak whipping behind her like a flag of surrender, looking down into the valley where the last lights were flickering out one by one. Not blown out. Extinguished. As if someone had reached down and snuffed them. Below, the bells started to ring—not the clear peal of morning, but the slow, heavy toll of evening. They already knew. “Don’t,” he said, his voice barely audible over the rising gale. He reached for her, but his hand passed through the air where her shoulder should have been. “If you step over that ridge, you’re dead.” She didn’t look back. The cold bit into her cheeks, but her eyes were steady. “I’ve been dead since I left you,” she said quietly. “I’m just finishing the job now.” Thunder cracked overhead, close enough to shake the ground beneath their feet. For a second, lightning split the sky, illuminating her face—a mask of resolve carved from grief and fury. Not the girl he’d known, but something harder, sharper, forged in fire. “You’ll kill yourself,” he shouted. She turned then, meeting his gaze with a smile that didn’t reach her eyes. “Maybe,” she said. “But I’ll take some of them with me.” And with that, she stepped forward. The mountain roared.

## Judgment

Blunt judgment: this 27B run is operationally successful and qualitatively strong, but not yet a strict-fidelity winner.

Why:

- output is fluent and coherent end-to-end
- it preserves core scene structure and emotional trajectory
- it still injects extra aggression/drama in places (for example, more forceful phrasing and added intent)
- that means the rewrite can be too interventionist for strict minimal-touch unslop behavior

## Comparison vs pilot series

- **0.6B**: failed badly; became a different story
- **1.7B**: more fluent than 0.6B, but still invented scenes and structure
- **4B**: first clearly improved text-only model in the series; mostly kept the scene intact, but still drifted and over-shaped the prose
- **9B (retrained)**: strong practical baseline in the new family; often cleaner and more controlled on medium-length rewrites
- **30B-A3B VL Instruct**: still the safest fidelity-first reference on long-form passages
- **Qwen3.5 27B (this run)**: stronger language quality than smaller dense lanes, but currently trends more stylistic/aggressive than ideal in strict-preservation mode

## Conclusion

This repo is now a completed post-run artifact for Qwen 3.5 27B with real training metrics, a normalized loss curve, and held-out sample output. The core result: 27B clearly works and writes well, but its current behavior still needs tighter rewrite constraints before it can replace the 30B reference lane for fidelity-sensitive unslop. It is a serious positive experiment, not a final default.
training_loss_vs_progress.svg
ADDED