Qwen3-1.7B Backward LoRA Adapter
This repository contains the LoRA adapter trained for Assignment 3, Part 1: a backward model p(x|y) that predicts the original user instruction from an assistant response.
Model Details
- Base model: Qwen/Qwen3-1.7B
- Adapter type: LoRA (peft)
- Task: backward instruction generation (response -> instruction)
- Training objective: given an assistant response, generate the user instruction that most likely produced it
- Finetuned from: Qwen/Qwen3-1.7B
Intended Use
This adapter is intended for coursework reproduction of the paper Self-Alignment with Instruction Backtranslation. It is designed to be used as the backward model in the assignment pipeline:
- Train p(x|y) on seed instruction-response data.
- Use the backward model to infer instructions from LIMA responses.
- Curate synthetic pairs.
- Train the final forward instruction-following model.
This model is not intended as a general-purpose assistant.
Training Data
Training data came from timdettmers/openassistant-guanaco.
Because this dataset stores conversations in a single text field, preprocessing extracted all complete:
### Human: ...### Assistant: ...
pairs from each conversation and converted them into backward samples:
- input / condition: assistant response y
- target: user instruction x
Final dataset sizes:
- Usable backward examples: 13,436
- Train split: 13,180
- Eval split: 256
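The extraction step described above could look roughly like the following. This is a minimal sketch, not the preprocessing script used for this adapter: the regular expression, the function name `extract_backward_samples`, and the output keys are illustrative assumptions, and each turn is treated independently of earlier turns in the same conversation.

```python
import re

# Matches one complete "### Human: ... ### Assistant: ..." turn. The assistant
# text runs until the next "### Human:" marker or the end of the conversation.
PAIR_RE = re.compile(
    r"### Human:(.*?)### Assistant:(.*?)(?=### Human:|\Z)",
    re.DOTALL,
)


def extract_backward_samples(text):
    """Return backward samples (condition = response y, target = instruction x)."""
    samples = []
    for instruction, response in PAIR_RE.findall(text):
        instruction, response = instruction.strip(), response.strip()
        # Skip turns where either side is empty after stripping whitespace.
        if instruction and response:
            samples.append({"response": response, "instruction": instruction})
    return samples


conversation = "### Human: Hi### Assistant: Hello!### Human: What is 2+2?### Assistant: 4"
for sample in extract_backward_samples(conversation):
    print(sample)
```

A trailing `### Human:` turn with no assistant reply does not match the pattern, so incomplete pairs are dropped.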
Training Procedure
Prompt Format
Each training sample used a causal LM prompt with masked loss on the instruction only:
System: You are a reverse instruction generation model.
Given the following assistant response, generate the original user instruction that most likely produced it.
Assistant response:
<response>
Recovered user instruction:
<instruction>
The loss is applied only to the <instruction> portion.
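One common way to restrict the loss to the instruction is to set the labels of all prompt tokens to -100, the default ignore index of PyTorch's cross-entropy loss. The sketch below works on plain lists of token ids so that it runs without a tokenizer; the function name `build_example`, the appended end-of-sequence token, and right-side truncation are assumptions, not details confirmed by this card.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss


def build_example(prompt_ids, target_ids, eos_id, max_len=1024):
    """Concatenate prompt and instruction ids; mask the prompt in the labels.

    prompt_ids: token ids for everything up to and including
                "Recovered user instruction:"
    target_ids: token ids of the instruction to be predicted
    eos_id:     end-of-sequence id appended after the instruction (assumption)
    """
    input_ids = prompt_ids + target_ids + [eos_id]
    labels = [IGNORE_INDEX] * len(prompt_ids) + target_ids + [eos_id]
    # Truncate both sequences identically so they stay aligned.
    return {"input_ids": input_ids[:max_len], "labels": labels[:max_len]}


example = build_example(prompt_ids=[11, 12, 13], target_ids=[21, 22], eos_id=2)
print(example["labels"])  # [-100, -100, -100, 21, 22, 2]
```

With right-side truncation, a very long response can push the instruction past the maximum length, so such samples may need to be filtered or have the response shortened first.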
Hyperparameters
- Precision: bf16
- Epochs: 2
- Max sequence length: 1024
- Learning rate: 1e-4
- Per-device train batch size: 4
- Per-device eval batch size: 4
- Gradient accumulation steps: 4
- Warmup ratio: 0.03
- Weight decay: 0.0
- Gradient checkpointing: true
LoRA Configuration
- r: 16
- alpha: 32
- dropout: 0.05
- target modules: q_proj, k_proj, v_proj, o_proj
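For reference, the values above map onto a `peft` `LoraConfig` roughly as follows. This is a sketch rather than the exact training configuration: `bias="none"` and `task_type="CAUSAL_LM"` are assumptions that the card does not state.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                  # LoRA rank
    lora_alpha=32,         # scaling factor (alpha)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    bias="none",           # assumption: bias terms are not trained
    task_type="CAUSAL_LM", # assumption: standard causal LM fine-tuning
)
```

The config would then be passed to `peft.get_peft_model(base_model, lora_config)` before training.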
Results
Training completed successfully.
- Total train steps: 1648
- Final train loss: 1.7289
- Best observed eval loss region: about 1.6770 - 1.6896
- Runtime: about 57.8 minutes
The model learns the backward mapping and is suitable for generating candidate instructions for the next stage of the assignment pipeline.
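The reported step count is consistent with the listed hyperparameters. The short calculation below is a sanity check under two assumptions that the card does not state: training ran on a single device, and a final partial accumulation batch counts as one optimizer step.

```python
import math

train_examples = 13_180
per_device_batch = 4
grad_accum = 4
num_devices = 1  # assumption: single-device training
epochs = 2

# Examples consumed per optimizer step.
effective_batch = per_device_batch * grad_accum * num_devices
# A final partial batch is assumed to count as one step, hence the ceiling.
steps_per_epoch = math.ceil(train_examples / effective_batch)
total_steps = steps_per_epoch * epochs

print(effective_batch, steps_per_epoch, total_steps)  # 16 824 1648
```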
Limitations
- Output quality depends heavily on the assistant response quality and prompt format.
- The model may produce verbose or partially repeated instructions on some inputs.
- This adapter is optimized for coursework reproduction rather than production deployment.
How to Load
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "Qwen/Qwen3-1.7B"
adapter_path = "path/to/this/adapter"

tokenizer = AutoTokenizer.from_pretrained(adapter_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    trust_remote_code=True,
    dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()
Example Prompt
System: You are a reverse instruction generation model.
Given the following assistant response, generate the original user instruction that most likely produced it.
Assistant response:
<assistant response here>
Recovered user instruction:
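A possible way to fill this prompt and generate an instruction, assuming `model` and `tokenizer` were loaded as in the section above. The helper names `build_prompt` and `recover_instruction`, the exact line breaks in the template, greedy decoding, and `max_new_tokens=128` are illustrative assumptions; the whitespace should match whatever was used during training.

```python
PROMPT_TEMPLATE = (
    "System: You are a reverse instruction generation model.\n"
    "Given the following assistant response, generate the original user "
    "instruction that most likely produced it.\n\n"
    "Assistant response:\n{response}\n\n"
    "Recovered user instruction:\n"
)


def build_prompt(response):
    """Insert the assistant response into the backward prompt."""
    return PROMPT_TEMPLATE.format(response=response.strip())


def recover_instruction(model, tokenizer, response, max_new_tokens=128):
    """Generate a candidate instruction for one assistant response."""
    inputs = tokenizer(build_prompt(response), return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding (assumption)
    )
    # Keep only the tokens generated after the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `recover_instruction(model, tokenizer, "Paris is the capital of France.")` would then return the model's guess at the original question.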
License
Use of this adapter should follow the license and usage terms of the base model Qwen/Qwen3-1.7B and the underlying training data.