Jackrong committed on
Commit 7c9b3a9 · verified · 1 Parent(s): c485a9e

Update README.md

Files changed (1)
  1. README.md +28 -1
README.md CHANGED
@@ -16,7 +16,31 @@ datasets:
 - nohurry/Opus-4.6-Reasoning-3000x-filtered
 - Jackrong/Qwen3.5-reasoning-700x
 ---
+# 🌟 Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled
+
+🔥 **Update (April 5): To help beginners and enthusiasts better understand and reproduce the fine-tuning process of this model, I have prepared the complete training notebook, codebase, and a comprehensive companion PDF guide! Please check the resource links below.**
+
+> ❤️ Special thanks to the Unsloth open-source library and @KyleHessling1 for their support.
+
+## 📚 Resources & Guides
+
+If you want to dive into how this model was trained, or wish to reproduce the results locally or on Colab, please visit my GitHub repository:
+👉 **[Jackrong-llm-finetuning-guide](https://github.com/R6410418/Jackrong-llm-finetuning-guide)**
+
+### 📥 Core Technical Document Direct Download
+You can click the link below to directly access the complete technical manual for the Qwopus3.5 training:
+
+* **[Qwopus3-5-27b-Colab_complete_guide_to_llm_finetuning.pdf](https://github.com/R6410418/Jackrong-llm-finetuning-guide/raw/main/Qwopus3-5-27b-Colab_complete_guide_to_llm_finetuning.pdf)**
+  * Covers the entire workflow, starting with an introduction to Google Colab and Unsloth.
+  * Details the complete pipeline with step-by-step explanations: from downloading the base model and normalizing heterogeneous data sources into a unified format, to configuring trainer hyperparameters and finally publishing to Hugging Face (a condensed sketch follows below).
+  * Feedback is highly welcome! If you spot any shortcomings or areas for improvement, please let me know, and I will update the guide promptly.
 
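As a rough taste of the pipeline the guide covers, a condensed version might look like the sketch below. This is an illustration only, not the notebook's actual code: the base checkpoint name, dataset choice, and hyperparameters are placeholders, and exact `unsloth`/`trl` argument names vary between releases.

```python
# Hypothetical sketch of the Colab workflow described in the guide:
# load a 4-bit base model with Unsloth, attach LoRA adapters, run SFT
# with TRL, then publish to Hugging Face. Checkpoint, dataset, and
# hyperparameters below are placeholders, not the real training recipe.
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/your-base-model-here",  # placeholder checkpoint
    max_seq_length=4096,
    load_in_4bit=True,  # keeps a 27B-class model within a single Colab GPU
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("nohurry/Opus-4.6-Reasoning-3000x-filtered", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,       # `processing_class` in newer trl releases
    train_dataset=dataset,     # assumes a pre-formatted "text" column
    args=SFTConfig(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        num_train_epochs=1,
        output_dir="outputs",
    ),
)
trainer.train()

# Publish the adapters (or merged weights) to the Hub.
model.push_to_hub("your-username/your-finetuned-model")
tokenizer.push_to_hub("your-username/your-finetuned-model")
```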
+> **A Note:**
+> My goal in writing this guide goes beyond merely detailing a single training workflow. I want to convey a broader message: fine-tuning, post-training, and even medium-scale pre-training are not unattainable technical rituals, nor are they the exaggerated hype often packaged by social media. More often than not, all you need is a Google account, a standard laptop, and relentless curiosity.
+>
+> *No one starts as an expert. But every expert was once brave enough to begin.*
+
+---
  # 🌟 Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled
 
 > **Build Environment Upgrades:**
@@ -53,6 +77,8 @@ Let me analyze this request carefully:
 .
 ```
 
+---
+
 ## 🗺️ Training Pipeline Overview
 
 ```text
@@ -73,6 +99,7 @@ Final Model (Claude-4.6-Opus-Reasoning-Distilled, text-only)
 
 > **From the test results, it is clear that different Qwen3.5 quantized models show significant differences in tool-calling capability. Among them, only the 27B model distilled with Claude Opus reasoning demonstrates stable performance.**
 
+---
 
 🔥 **Community-tested advantages** (benchmark tests by user @sudoing on a single RTX 3090):
 
@@ -91,6 +118,7 @@ Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled shows significant advantages in
 
 **Thanks to the community for the in-depth testing and feedback!**
 
+---
 
 ### 🔹 Supervised Fine-Tuning (SFT)
 - **Objective:** To inject high-density reasoning logic and establish a strict problem-solving format in which an internal thinking stage precedes the final response (a minimal format sketch follows below).
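To make that format concrete, here is a minimal, hypothetical sketch of how one training sample could be laid out. The `<|im_start|>`/`<|im_end|>` markers follow Qwen's ChatML-style convention and `<think>` tags carry the reasoning trace, but the field names are invented for illustration; the real template should come from the model tokenizer's chat template.

```python
# Minimal sketch of the "think first, then answer" SFT layout described
# above. Field names ("prompt", "reasoning", "response") are placeholders,
# not the actual dataset schema.

def format_sample(sample: dict) -> str:
    """Wrap the distilled reasoning trace in <think> tags so the model
    learns to emit an internal thinking stage before the final answer."""
    return (
        f"<|im_start|>user\n{sample['prompt']}<|im_end|>\n"
        "<|im_start|>assistant\n"
        f"<think>\n{sample['reasoning']}\n</think>\n"
        f"{sample['response']}<|im_end|>\n"
    )

example = {
    "prompt": "What is 17 * 23?",
    "reasoning": "17 * 23 = 17 * 20 + 17 * 3 = 340 + 51 = 391.",
    "response": "17 * 23 = 391.",
}
print(format_sample(example))
```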
@@ -103,7 +131,6 @@ The dataset consists of high-quality, filtered reasoning distillation data:
 | Dataset Name | Description / Purpose |
 |--------------|-----------------------|
 | [nohurry/Opus-4.6-Reasoning-3000x-filtered](https://huggingface.co/datasets/nohurry/Opus-4.6-Reasoning-3000x-filtered) | Provides comprehensive Claude 4.6 Opus reasoning trajectories. |
-| [TeichAI/claude-4.5-opus-high-reasoning-250x](https://huggingface.co/datasets/TeichAI/claude-4.5-opus-high-reasoning-250x) | Injecting high-intensity, structured reasoning instances. |
 | [Jackrong/Qwen3.5-reasoning-700x](https://huggingface.co/datasets/Jackrong/Qwen3.5-reasoning-700x) | Additional curated reasoning samples designed to strengthen structured step-by-step problem solving and improve reasoning diversity. |
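For illustration, combining the two retained sources into a single training corpus could look like the sketch below. The split name and the column-intersection step are assumptions; the two datasets' actual schemas may differ, and the real preprocessing is documented in the companion guide.

```python
# Hypothetical sketch of merging the two reasoning datasets into one
# SFT corpus; split name and column handling are assumptions.
from datasets import concatenate_datasets, load_dataset

opus = load_dataset("nohurry/Opus-4.6-Reasoning-3000x-filtered", split="train")
qwen = load_dataset("Jackrong/Qwen3.5-reasoning-700x", split="train")

# Normalize heterogeneous sources to their shared columns before mixing,
# matching the guide's "unified format" step.
shared = [c for c in opus.column_names if c in qwen.column_names]
corpus = concatenate_datasets(
    [opus.select_columns(shared), qwen.select_columns(shared)]
).shuffle(seed=42)
print(corpus)
```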
 
  ## 🌟 Core Skills & Capabilities
 