Oysiyl committed on
Commit
8e2b93e
·
verified ·
1 Parent(s): 862b57e

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +17 -4
README.md CHANGED
@@ -43,7 +43,7 @@ This run is the first large dense-text Qwen 3.5 follow-up after the earlier pilo
  - bf16 training
 
  ## Training outcome
- This 27B run completed successfully on Hugging Face Jobs and pushed its adapter repo cleanly.
+ This 27B run completed successfully on Hugging Face Jobs and pushed a deployable merged-model repo cleanly.
 
  - train_runtime: 10665.33s
  - train_loss: 1.916
@@ -53,7 +53,7 @@ Operator notes:
  - model load path succeeded on `unsloth/Qwen3.5-27B`
  - dataset formatting and tokenization completed to full planned run length
  - training reached step 1000 and emitted final trainer metrics
- - adapter push completed
+ - merged-model push completed (this repo is currently serving as a full-model artifact, not a PEFT adapter-only repo)
  - GGUF export + GGUF upload completed (q2_k, q4_k_m, q6_k, q8_0)
 
  ## Intended use
@@ -72,7 +72,7 @@ Use this model as a pipeline stage for:
 
  Normalized curve from the completed 27B run:
 
- ![Qwen3.5-27B training loss vs normalized progress](./training_loss_vs_progress.svg)
+ ![Qwen3.5-27B training loss vs normalized progress](./training_loss_vs_progress.png)
 
  Curve notes:
  - first logged loss: 2.279
@@ -80,6 +80,15 @@ Curve notes:
  - trainer-reported final train_loss: 1.916
  - final step: 1000
 
+ Across the successful Qwen 3.5 HF Jobs runs, trainer-reported final `train_loss` values were:
+ - 0.8B: 2.504
+ - 2B: 2.241
+ - 4B: 1.989
+ - 9B: 1.854
+ - 27B: 1.916
+
+ ![Qwen3.5 family HF Jobs final train_loss](./training_loss_qwen35_family.png)
+
  ## Recommended evaluation sample
  Use this full fiction passage for held-out testing:
 
@@ -95,7 +104,11 @@ Use this full fiction passage for held-out testing:
 
  ## Full observed output on that sample
 
- Observed output from local eval run:
+ Live Modal short sanity sample observed output:
+
+ > This tool saves teams hours every week, yet the copy still sounds mechanical and bland.
+
+ Observed output from held-out eval run:
 
  > The wind picked up, sudden and violent, as if the mountain had drawn a sharp breath. She stood at the lip of the pass, her cloak whipping behind her like a flag of surrender, looking down into the valley where the last lights were flickering out one by one. Not blown out. Extinguished. As if someone had reached down and snuffed them. Below, the bells started to ring—not the clear peal of morning, but the slow, heavy toll of evening. They already knew. “Don’t,” he said, his voice barely audible over the rising gale. He reached for her, but his hand passed through the air where her shoulder should have been. “If you step over that ridge, you’re dead.” She didn’t look back. The cold bit into her cheeks, but her eyes were steady. “I’ve been dead since I left you,” she said quietly. “I’m just finishing the job now.” Thunder cracked overhead, close enough to shake the ground beneath their feet. For a second, lightning split the sky, illuminating her face—a mask of resolve carved from grief and fury. Not the girl he’d known, but something harder, sharper, forged in fire. “You’ll kill yourself,” he shouted. She turned then, meeting his gaze with a smile that didn’t reach her eyes. “Maybe,” she said. “But I’ll take some of them with me.” And with that, she stepped forward. The mountain roared.
 
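The figures quoted in this diff (a `train_runtime` of 10665.33s over 1000 steps, the first logged loss of 2.279, and the per-size final `train_loss` list) allow a quick arithmetic sanity check. The following is a minimal sketch, not part of the README or the training code: the function and constant names are hypothetical, and the numbers are copied from the diff. Note that the first logged loss is a single logging point while the trainer-reported `train_loss` is a run-level summary, so the relative drop is only a rough indicator.

```python
# Values copied from the README diff; all names here are illustrative assumptions.
FINAL_TRAIN_LOSS = {
    "0.8B": 2.504,
    "2B": 2.241,
    "4B": 1.989,
    "9B": 1.854,
    "27B": 1.916,
}
TRAIN_RUNTIME_S = 10665.33   # train_runtime for the 27B run
FINAL_STEP = 1000            # final step for the 27B run
FIRST_LOGGED_LOSS = 2.279    # first logged loss for the 27B run


def seconds_per_step(runtime_s: float, steps: int) -> float:
    """Average wall-clock seconds spent per optimizer step."""
    return runtime_s / steps


def lowest_loss_size(losses: dict[str, float]) -> str:
    """Return the model size label with the smallest final train_loss."""
    return min(losses, key=losses.get)


def relative_drop(first: float, final: float) -> float:
    """Fractional decrease from the first logged loss to the final value."""
    return (first - final) / first


if __name__ == "__main__":
    print(f"seconds per step: {seconds_per_step(TRAIN_RUNTIME_S, FINAL_STEP):.2f}")
    print(f"lowest final train_loss: {lowest_loss_size(FINAL_TRAIN_LOSS)}")
    drop = relative_drop(FIRST_LOGGED_LOSS, FINAL_TRAIN_LOSS["27B"])
    print(f"27B relative drop: {drop:.1%}")
```

On these numbers the 9B run, not the 27B run, reports the lowest final `train_loss`, so the family list is not monotonic in model size.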