maru-miya committed
Commit 5053014 · verified · 1 Parent(s): 380025b

Add complete README with correct training hyperparams and upload model

Files changed (1):
  1. README.md +22 -6

README.md CHANGED
@@ -40,14 +40,30 @@ Loss is applied to **all assistant turns** in the multi-turn trajectory,
 enabling the model to learn environment observation, action selection,
 tool use, and recovery from errors.
 
-## Training Configuration
+## Training & Merge Configuration
+
+**[Merge Settings (MergeKit)]**
+- Method: DARE-TIES
+- Base model for merge: Qwen/Qwen3-4B-Instruct-2507
+- Models & Parameters:
+  - `maru-miya/lora_agentbench_qwen3_4b_d20_t1` (weight: 1.0, density: 0.7)
+  - `maru-miya/lora_agentbench_qwen3_4b_d21_t9_db` (weight: 1.2, density: 0.7)
+
+**[Original LoRA Adapter 1: d20_t1]**
+- Base model: Qwen/Qwen3-4B-Instruct-2507
+- Method: LoRA (full precision base)
+- Max sequence length: 2048
+- Epochs: 2
+- Learning rate: 2e-06
+- LoRA: r=16, alpha=32
 
-- Base model: Qwen3-4B-Instruct-2507
-- Method: MergeKit (DARE-TIES) combining two LoRA adapters
+**[Original LoRA Adapter 2: d21_t9_db]**
+- Base model: Qwen/Qwen3-4B-Instruct-2507
+- Method: LoRA (full precision base)
 - Max sequence length: 2048
-- Epochs: 1
-- Learning rate: 2e-6
-- LoRA: r=64, alpha=128
+- Epochs: 2
+- Learning rate: 2e-05
+- LoRA: r=16, alpha=32
 
 ## Usage
 
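The merge settings recorded in this commit map onto a MergeKit YAML config roughly like the sketch below. This is a reconstruction from the README's stated parameters, not the author's actual config file; the `dtype` value and the assumption that each LoRA adapter was merged into the base model before the DARE-TIES merge are mine.

```yaml
# Sketch of a mergekit dare_ties config matching the README's merge settings.
# mergekit merges full checkpoints, so each LoRA adapter is assumed to have
# been merged into Qwen3-4B-Instruct-2507 beforehand.
merge_method: dare_ties
base_model: Qwen/Qwen3-4B-Instruct-2507
models:
  - model: maru-miya/lora_agentbench_qwen3_4b_d20_t1
    parameters:
      weight: 1.0
      density: 0.7
  - model: maru-miya/lora_agentbench_qwen3_4b_d21_t9_db
    parameters:
      weight: 1.2
      density: 0.7
dtype: bfloat16  # assumption; precision is not stated in the commit
```

A config like this would be run with mergekit's CLI, e.g. `mergekit-yaml config.yml ./merged-model`.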