Add complete README with correct training hyperparams and upload model
README.md CHANGED

@@ -40,14 +40,30 @@ Loss is applied to **all assistant turns** in the multi-turn trajectory,
 enabling the model to learn environment observation, action selection,
 tool use, and recovery from errors.
 
-## Training Configuration
-
-- Max sequence length: 2048
-- Epochs:
-- Learning rate: 2e-
-- LoRA: r=
+## Training & Merge Configuration
+
+**[Merge Settings (MergeKit)]**
+- Method: DARE-TIES
+- Base model for merge: Qwen/Qwen3-4B-Instruct-2507
+- Models & Parameters:
+  - `maru-miya/lora_agentbench_qwen3_4b_d20_t1` (weight: 1.0, density: 0.7)
+  - `maru-miya/lora_agentbench_qwen3_4b_d21_t9_db` (weight: 1.2, density: 0.7)
+
+**[Original LoRA Adapter 1: d20_t1]**
+- Base model: Qwen/Qwen3-4B-Instruct-2507
+- Method: LoRA (full precision base)
+- Max sequence length: 2048
+- Epochs: 2
+- Learning rate: 2e-06
+- LoRA: r=16, alpha=32
+
+**[Original LoRA Adapter 2: d21_t9_db]**
+- Base model: Qwen/Qwen3-4B-Instruct-2507
+- Method: LoRA (full precision base)
+- Max sequence length: 2048
+- Epochs: 2
+- Learning rate: 2e-05
+- LoRA: r=16, alpha=32
 
 ## Usage
 
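The commit does not include the merge config file itself, but the merge settings above map onto a MergeKit YAML along these lines. This is a sketch under assumptions: the `dare_ties` method name, `weight`/`density` parameters, and the `base+adapter` syntax for applying a LoRA before merging follow MergeKit's documented schema, but the `dtype` and exact layout the author used are guesses.

```yaml
# Hypothetical MergeKit config reconstructed from the README's merge settings.
merge_method: dare_ties
base_model: Qwen/Qwen3-4B-Instruct-2507
models:
  - model: Qwen/Qwen3-4B-Instruct-2507+maru-miya/lora_agentbench_qwen3_4b_d20_t1
    parameters:
      weight: 1.0
      density: 0.7
  - model: Qwen/Qwen3-4B-Instruct-2507+maru-miya/lora_agentbench_qwen3_4b_d21_t9_db
    parameters:
      weight: 1.2
      density: 0.7
dtype: bfloat16  # assumption; not stated in the README
```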
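For intuition on what `density: 0.7` and the per-adapter weights do: the DARE step keeps each entry of an adapter's delta weights with probability equal to the density and rescales survivors by `1/density` (so the expected delta is unchanged), and the TIES step elects a per-parameter sign and drops contributions that disagree with it. The toy NumPy sketch below illustrates the idea only; it is not MergeKit's implementation (real TIES also trims by magnitude and averages agreeing entries differently), and the random matrices are stand-ins for the adapters' delta weights.

```python
import numpy as np

def dare_sparsify(delta, density, rng):
    """DARE step: keep each delta entry with probability `density`,
    rescale survivors by 1/density so the expected delta is unchanged."""
    mask = rng.random(delta.shape) < density
    return np.where(mask, delta / density, 0.0)

def ties_merge(deltas, weights):
    """Simplified TIES step: elect a per-parameter sign from the weighted
    sum of deltas, then sum only the contributions agreeing with it."""
    stacked = np.stack([w * d for w, d in zip(weights, deltas)])
    elected = np.sign(stacked.sum(axis=0))
    agree = np.sign(stacked) == elected
    return (stacked * agree).sum(axis=0)

rng = np.random.default_rng(0)
d1 = rng.normal(size=(4, 4))  # stand-in for adapter d20_t1's delta weights
d2 = rng.normal(size=(4, 4))  # stand-in for adapter d21_t9_db's delta weights

# Densities and weights taken from the merge settings above.
merged = ties_merge(
    [dare_sparsify(d1, 0.7, rng), dare_sparsify(d2, 0.7, rng)],
    weights=[1.0, 1.2],
)
```

With density 0.7, roughly 30% of each adapter's delta entries are zeroed before merging, which is what lets two adapters trained at different learning rates (2e-06 vs 2e-05) combine with fewer sign conflicts.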