maru-miya committed
Commit 5053014 · verified · 1 Parent(s): 380025b

Add complete README with correct training hyperparams and upload model

Files changed (1):
  1. README.md +22 -6

README.md CHANGED
@@ -40,14 +40,30 @@ Loss is applied to **all assistant turns** in the multi-turn trajectory,
 enabling the model to learn environment observation, action selection,
 tool use, and recovery from errors.
 
-## Training Configuration
+## Training & Merge Configuration
+
+**[Merge Settings (MergeKit)]**
+- Method: DARE-TIES
+- Base model for merge: Qwen/Qwen3-4B-Instruct-2507
+- Models & Parameters:
+  - `maru-miya/lora_agentbench_qwen3_4b_d20_t1` (weight: 1.0, density: 0.7)
+  - `maru-miya/lora_agentbench_qwen3_4b_d21_t9_db` (weight: 1.2, density: 0.7)
+
+**[Original LoRA Adapter 1: d20_t1]**
+- Base model: Qwen/Qwen3-4B-Instruct-2507
+- Method: LoRA (full precision base)
+- Max sequence length: 2048
+- Epochs: 2
+- Learning rate: 2e-06
+- LoRA: r=16, alpha=32
 
-- Base model: Qwen3-4B-Instruct-2507
-- Method: MergeKit (DARE-TIES) combining two LoRA adapters
+**[Original LoRA Adapter 2: d21_t9_db]**
+- Base model: Qwen/Qwen3-4B-Instruct-2507
+- Method: LoRA (full precision base)
 - Max sequence length: 2048
-- Epochs: 1
-- Learning rate: 2e-6
-- LoRA: r=64, alpha=128
+- Epochs: 2
+- Learning rate: 2e-05
+- LoRA: r=16, alpha=32
 
 ## Usage
 
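The merge settings recorded in this commit map onto a MergeKit YAML config roughly like the sketch below. This is a reconstruction from the README's stated parameters, not the author's actual config file; the `dtype` value and the assumption that each LoRA adapter was merged into the base model before the DARE-TIES merge are mine.

```yaml
# Sketch of a mergekit dare_ties config matching the README's merge settings.
# mergekit merges full checkpoints, so each LoRA adapter is assumed to have
# been merged into Qwen3-4B-Instruct-2507 beforehand.
merge_method: dare_ties
base_model: Qwen/Qwen3-4B-Instruct-2507
models:
  - model: maru-miya/lora_agentbench_qwen3_4b_d20_t1
    parameters:
      weight: 1.0
      density: 0.7
  - model: maru-miya/lora_agentbench_qwen3_4b_d21_t9_db
    parameters:
      weight: 1.2
      density: 0.7
dtype: bfloat16  # assumption; precision is not stated in the commit
```

A config like this would be run with mergekit's CLI, e.g. `mergekit-yaml config.yml ./merged-model`.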