qwen3-4b-agent-trajectory-lora (Merged Model)
This repository provides a LoRA adapter for Qwen3-4B-Instruct-2507, produced by merging two adapters that were fine-tuned with LoRA + Unsloth.
[Merge Information] This model is a merged adapter created using MergeKit (DARE-TIES) to combine the strengths of the following two models:
- maru-miya/lora_agentbench_qwen3_4b_d20_t1 (general agent focus)
- maru-miya/lora_agentbench_qwen3_4b_d21_t9_db (DB-specialized focus)
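For intuition, DARE-TIES first randomly prunes each adapter's delta weights (keeping roughly a `density` fraction) and rescales the survivors by 1/density so the expected magnitude is preserved, then resolves sign conflicts before summing. The following is a minimal sketch of the DARE pruning step only; `dare_prune` is a hypothetical helper for illustration, not MergeKit's internal implementation:

```python
import random

def dare_prune(delta, density, seed=0):
    """Randomly keep about a `density` fraction of delta values and
    rescale the survivors by 1/density to preserve the expected sum."""
    rng = random.Random(seed)
    return [d / density if rng.random() < density else 0.0 for d in delta]

delta = [0.2, -0.1, 0.4, 0.05]
pruned = dare_prune(delta, density=0.7)
# Each surviving entry is the original value divided by 0.7; the rest are zero.
print(pruned)
```

With density 0.7 (as used for both adapters here), about 30% of each delta is dropped before the sign-election and summing steps.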
This repository contains LoRA adapter weights only. The base model must be loaded separately.
Training Objective
This adapter is trained to improve multi-turn agent task performance on ALFWorld (household tasks) and DBBench (database operations).
Loss is applied to all assistant turns in the multi-turn trajectory, enabling the model to learn environment observation, action selection, tool use, and recovery from errors.
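Concretely, applying loss to all assistant turns means the label sequence is masked (set to -100, the index ignored by cross-entropy) everywhere except assistant tokens, so every assistant turn contributes to training rather than only the final one. A simplified sketch of that masking, with illustrative token IDs and role spans:

```python
IGNORE_INDEX = -100  # labels with this value are excluded from the loss

def mask_labels(token_ids, roles):
    """Keep labels only for assistant tokens; mask user/system/tool tokens."""
    return [tok if role == "assistant" else IGNORE_INDEX
            for tok, role in zip(token_ids, roles)]

token_ids = [11, 12, 13, 14, 15, 16]
roles = ["user", "user", "assistant", "assistant", "user", "assistant"]
labels = mask_labels(token_ids, roles)
print(labels)  # [-100, -100, 13, 14, -100, 16]
```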
Training & Merge Configuration
[Merge Settings (MergeKit)]
- Method: DARE-TIES
- Base model for merge: Qwen/Qwen3-4B-Instruct-2507
- Models & Parameters:
  - maru-miya/lora_agentbench_qwen3_4b_d20_t1 (weight: 1.0, density: 0.7)
  - maru-miya/lora_agentbench_qwen3_4b_d21_t9_db (weight: 1.2, density: 0.7)
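A MergeKit configuration matching these settings would look roughly like the sketch below. The exact file used for this merge is not published; the `dtype` choice is an assumption:

```yaml
merge_method: dare_ties
base_model: Qwen/Qwen3-4B-Instruct-2507
models:
  - model: maru-miya/lora_agentbench_qwen3_4b_d20_t1
    parameters:
      weight: 1.0
      density: 0.7
  - model: maru-miya/lora_agentbench_qwen3_4b_d21_t9_db
    parameters:
      weight: 1.2
      density: 0.7
dtype: bfloat16
```

The higher weight (1.2) on the DB adapter biases the merged deltas toward the DBBench-specialized behavior.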
[Original LoRA Adapter 1: d20_t1]
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: LoRA (full precision base)
- Max sequence length: 2048
- Epochs: 2
- Learning rate: 2e-06
- LoRA: r=16, alpha=32
[Original LoRA Adapter 2: d21_t9_db]
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: LoRA (full precision base)
- Max sequence length: 2048
- Epochs: 2
- Learning rate: 2e-05
- LoRA: r=16, alpha=32
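Both adapters use r=16 and alpha=32, so each LoRA update is applied as W'x = Wx + (alpha/r)·B(Ax) with a scaling factor of alpha/r = 2.0. A tiny pure-Python sketch of that computation (the 2-dimensional example matrices are illustrative; the demo uses rank 1 for readability while keeping the card's configured alpha/r scaling):

```python
r, alpha = 16, 32
scaling = alpha / r  # LoRA scales the low-rank update by alpha/r = 2.0

def lora_delta(x, A, B):
    """Compute the LoRA update scaling * B @ (A @ x) with plain lists."""
    ax = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    bax = [sum(b * aj for b, aj in zip(row, ax)) for row in B]
    return [scaling * v for v in bax]

x = [1.0, 2.0]
A = [[0.5, 0.5]]      # down-projection (rank 1, hidden size 2)
B = [[1.0], [0.0]]    # up-projection back to hidden size 2
print(lora_delta(x, A, B))  # [3.0, 0.0]
```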
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "maru-miya/merged_agentbench_qwen3_4b_dare_ties_t1"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Attach the merged LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(model, adapter)

# Generate with the model's chat template.
messages = [{"role": "user", "content": "You are in a kitchen. Find the mug."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
Sources & Terms (IMPORTANT)
Training data:
- u-10bei/sft_alfworld_trajectory_dataset_v5
- u-10bei/dbbench_sft_dataset_react
This repository does NOT redistribute the datasets.
Users must comply with each dataset's license and the base model's terms of use.