Qwen-4B-DB-AlfWorld-v9
This repository provides a merged model fine-tuned from Qwen/Qwen3-4B-Instruct-2507 on four datasets: u-10bei/sft_alfworld_trajectory_dataset_v5, u-10bei/sft_alfworld_trajectory_dataset_v4, dbbench_sft_dataset_react_v2, and dbbench_sft_dataset_react_v3.
All LoRA adapter weights have been merged into the base model, and the
resulting merged model is saved here as a standalone model.
No external adapter loading is required.
Dataset Notes (IMPORTANT)
For u-10bei/sft_alfworld_trajectory_dataset_v5, only samples that met all of the following conditions were included in the training set:
- input length ≤ 2048 tokens,
- trajectory_outcome == "success",
- num_steps ≤ 35.
These filters were applied to ensure training stability, reduce noisy or failed trajectories, and maintain consistency with the maximum sequence length used during training.
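The three filters above can be sketched as a small predicate. This is only an illustration, not the preprocessing script used for this model: the `input` field name and the `count_tokens` callable are assumptions, while `trajectory_outcome` and `num_steps` are the field names quoted above.

```python
from typing import Callable, Dict, List

MAX_INPUT_TOKENS = 2048  # input length limit stated above
MAX_STEPS = 35           # step limit stated above


def keep_sample(sample: Dict, count_tokens: Callable[[str], int]) -> bool:
    """Return True only if a trajectory passes all three filters."""
    return (
        count_tokens(sample["input"]) <= MAX_INPUT_TOKENS  # "input" is a hypothetical field name
        and sample["trajectory_outcome"] == "success"
        and sample["num_steps"] <= MAX_STEPS
    )


def filter_samples(samples: List[Dict], count_tokens: Callable[[str], int]) -> List[Dict]:
    """Keep the samples that satisfy keep_sample, preserving order."""
    return [s for s in samples if keep_sample(s, count_tokens)]


# Toy data; a whitespace split stands in for the real tokenizer (assumption).
samples = [
    {"input": "go to fridge 1", "trajectory_outcome": "success", "num_steps": 12},
    {"input": "go to fridge 1", "trajectory_outcome": "failure", "num_steps": 12},
    {"input": "go to fridge 1", "trajectory_outcome": "success", "num_steps": 40},
]
kept = filter_samples(samples, count_tokens=lambda text: len(text.split()))
print(len(kept))
```

In practice `count_tokens` would likely wrap the model's own tokenizer so that the length check matches the training sequence length.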
Training Objective
This model is trained to improve multi-turn agent task performance on ALFWorld (household tasks) and DBBench (database operations).
Loss is applied to all assistant turns in multi-turn trajectories, enabling the model to learn environment observation, step-by-step reasoning, action execution, tool use, and recovery from errors.
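One common way to apply loss only to assistant turns is to mask every other token's label with -100, the default `ignore_index` of PyTorch's cross-entropy loss. The sketch below illustrates that idea under stated assumptions; it is not the training code for this model. The toy `tokenize` function is hypothetical, and chat-template special tokens are left out for brevity.

```python
from typing import Callable, Dict, List, Tuple

IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def build_labels(
    turns: List[Dict[str, str]],
    tokenize: Callable[[str], List[int]],
) -> Tuple[List[int], List[int]]:
    """Concatenate all turns; keep labels only on assistant tokens."""
    input_ids: List[int] = []
    labels: List[int] = []
    for turn in turns:
        ids = tokenize(turn["content"])
        input_ids.extend(ids)
        if turn["role"] == "assistant":
            labels.extend(ids)  # loss is computed on these positions
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # masked out of the loss
    return input_ids, labels


# Toy tokenizer (assumption): each word becomes its character count.
def toy_tokenize(text: str) -> List[int]:
    return [len(word) for word in text.split()]


turns = [
    {"role": "user", "content": "You are in a kitchen"},
    {"role": "assistant", "content": "go to fridge"},
    {"role": "user", "content": "The fridge is closed"},
    {"role": "assistant", "content": "open fridge"},
]
input_ids, labels = build_labels(turns, toy_tokenize)
print(labels)
```

Both assistant turns contribute to the loss here, while the environment observations in the user turns serve only as context.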
Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: LoRA (merged into final weights)
- Max sequence length: 2048
- Learning rate: 1e-05
- LoRA parameters used during training: r=8, alpha=16
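To make the LoRA numbers concrete, the sketch below stores the listed hyperparameters and derives two quantities from them: the standard LoRA scaling factor alpha / r, and the number of adapter parameters for a single weight matrix. The `LoraSettings` class and the 1024 x 1024 layer size are hypothetical and chosen only for illustration; they are not taken from the training code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hyperparameters listed above, collected in one place (hypothetical container)."""
    r: int = 8
    alpha: int = 16
    learning_rate: float = 1e-5
    max_seq_length: int = 2048

    @property
    def scaling(self) -> float:
        # In standard LoRA the low-rank update B @ A is multiplied by alpha / r.
        return self.alpha / self.r

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r).
        return self.r * (d_in + d_out)


settings = LoraSettings()
print(settings.scaling)                     # 2.0
print(settings.adapter_params(1024, 1024))  # 16384 for a hypothetical 1024 x 1024 layer
```

Because the adapters are already merged into the published weights, these values matter only for reproducing or continuing training, not for inference.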
Usage (Agent-style Inference Example)
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Umiharu/Qwen-4B-DB-AlfWorld-v9"

# Load the merged model; no adapter loading is needed.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "You are a household task-solving agent. Respond 'OK' if you are ready."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding: temperature has no effect when do_sample=False, so it is omitted.
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=False,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
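The example above sends a single prompt. For multi-turn agent tasks, the conversation usually alternates between environment observations and model actions. The loop below is a rough sketch of that pattern with the model and the environment injected as plain callables; `run_episode`, `generate`, `step`, and the scripted stand-ins are all hypothetical names and are not part of this repository or of any ALFWorld or DBBench API.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]


def run_episode(
    generate: Callable[[List[Message]], str],
    step: Callable[[str], Tuple[str, bool]],
    first_observation: str,
    max_steps: int,
) -> List[Message]:
    """Alternate observations (user turns) and actions (assistant turns)."""
    messages: List[Message] = [{"role": "user", "content": first_observation}]
    for _ in range(max_steps):
        action = generate(messages)              # model proposes the next action
        messages.append({"role": "assistant", "content": action})
        observation, done = step(action)         # environment executes the action
        if done:
            break
        messages.append({"role": "user", "content": observation})
    return messages


# Scripted stand-ins so the sketch runs without a model or an environment.
scripted_actions = iter(["go to fridge 1", "open fridge 1"])


def fake_generate(messages: List[Message]) -> str:
    return next(scripted_actions)


def fake_step(action: str) -> Tuple[str, bool]:
    if action.startswith("open"):
        return "You open the fridge 1.", True
    return "You arrive at fridge 1. It is closed.", False


history = run_episode(fake_generate, fake_step, "You are in the kitchen.", max_steps=5)
print([m["role"] for m in history])
```

With a real model, `generate` could wrap `tokenizer.apply_chat_template(messages, add_generation_prompt=True, ...)` followed by `model.generate`, so that each turn is formatted with the model's chat template.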
Sources & Terms (IMPORTANT)
Training data: u-10bei/sft_alfworld_trajectory_dataset_v5, u-10bei/sft_alfworld_trajectory_dataset_v4, dbbench_sft_dataset_react_v2, and dbbench_sft_dataset_react_v3
Dataset License: MIT License. This dataset is used and distributed under the terms of the MIT License.
Compliance: Users must comply with the MIT License (including the copyright notice) and the base model's original terms of use.