Qwen3-1.7B Backward LoRA Adapter

This repository contains the LoRA adapter trained for Assignment 3, Part 1: a backward model p(x|y) that predicts the original user instruction from an assistant response.

Model Details

  • Base model: Qwen/Qwen3-1.7B
  • Adapter type: LoRA (peft)
  • Task: backward instruction generation (response -> instruction)
  • Training objective: given an assistant response, generate the user instruction that most likely produced it
  • Finetuned from: Qwen/Qwen3-1.7B

Intended Use

This adapter is intended for coursework reproduction of the paper Self-Alignment with Instruction Backtranslation. It is designed to be used as the backward model in the assignment pipeline:

  1. Train p(x|y) on seed instruction-response data.
  2. Use the backward model to infer instructions from LIMA responses.
  3. Curate synthetic pairs.
  4. Train the final forward instruction-following model.

This model is not intended as a general-purpose assistant.

Training Data

Training data came from timdettmers/openassistant-guanaco.

Because this dataset stores each conversation in a single text field, preprocessing extracted every complete pair of turns:

  • ### Human: ...
  • ### Assistant: ...

Each pair was then converted into a backward sample:

  • input / condition: assistant response y
  • target: user instruction x
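
The extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the preprocessing script used for the assignment: the function name `extract_backward_samples` and the output keys `response` and `instruction` are hypothetical, and the sketch assumes turns are delimited only by the `### Human:` and `### Assistant:` markers.

```python
import re

# Splits a conversation on its role markers while keeping the role names.
TURN_RE = re.compile(r"### (Human|Assistant):")


def extract_backward_samples(text):
    """Return one backward sample per complete Human -> Assistant pair."""
    parts = TURN_RE.split(text)
    # parts looks like [prefix, role_1, content_1, role_2, content_2, ...]
    turns = [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]

    samples = []
    for (role_a, human), (role_b, assistant) in zip(turns, turns[1:]):
        # Keep only a Human turn that is directly followed by an Assistant turn,
        # and skip pairs where either side is empty.
        if role_a == "Human" and role_b == "Assistant" and human and assistant:
            samples.append({"response": assistant, "instruction": human})
    return samples


conversation = "### Human: Hi### Assistant: Hello!### Human: Bye"
print(extract_backward_samples(conversation))
```

A trailing `### Human:` turn with no reply, as in the example conversation, is dropped because it does not form a complete pair.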

Final dataset sizes:

  • Usable backward examples: 13,436
  • Train split: 13,180
  • Eval split: 256

Training Procedure

Prompt Format

Each training sample was formatted as a causal LM prompt followed by the target instruction:

System: You are a reverse instruction generation model.

Given the following assistant response, generate the original user instruction that most likely produced it.

Assistant response:
<response>

Recovered user instruction:
<instruction>

The loss is applied only to the <instruction> portion.
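
One common way to restrict the loss to the target tokens is to set the prompt positions in `labels` to -100, the default `ignore_index` of PyTorch's cross-entropy loss. The sketch below works on plain lists of token IDs so it runs without a tokenizer. The helper name `build_masked_example`, the appended EOS token, and the right-side truncation are assumptions, not details confirmed by this model card.

```python
IGNORE_INDEX = -100  # positions with this label are skipped by the loss


def build_masked_example(prompt_ids, target_ids, eos_id, max_len=1024):
    """Concatenate prompt and target IDs and mask the prompt out of the loss."""
    input_ids = list(prompt_ids) + list(target_ids) + [eos_id]
    # Prompt tokens get IGNORE_INDEX; instruction tokens (and EOS) keep their IDs.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(target_ids) + [eos_id]
    # Assumption: over-long samples are truncated on the right.
    return {"input_ids": input_ids[:max_len], "labels": labels[:max_len]}


example = build_masked_example(prompt_ids=[11, 12, 13], target_ids=[21, 22], eos_id=0)
print(example["labels"])  # [-100, -100, -100, 21, 22, 0]
```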

Hyperparameters

  • Precision: bf16
  • Epochs: 2
  • Max sequence length: 1024
  • Learning rate: 1e-4
  • Per-device train batch size: 4
  • Per-device eval batch size: 4
  • Gradient accumulation steps: 4
  • Warmup ratio: 0.03
  • Weight decay: 0.0
  • Gradient checkpointing: true
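
For reference, the values above map onto `transformers.TrainingArguments` field names roughly as shown below. This is a sketch written as a plain dictionary so it runs without any dependencies; the actual training script is not part of this card, and the variable names are illustrative.

```python
# Keys follow transformers.TrainingArguments field names.
training_kwargs = {
    "bf16": True,
    "num_train_epochs": 2,
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "gradient_accumulation_steps": 4,
    "warmup_ratio": 0.03,
    "weight_decay": 0.0,
    "gradient_checkpointing": True,
}

# The maximum sequence length is applied at tokenization time;
# it is not a TrainingArguments field.
MAX_SEQ_LENGTH = 1024

# Samples contributing to each optimizer update on one device.
effective_batch_per_device = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
print(effective_batch_per_device)  # 16
```

With transformers installed, the dictionary can be unpacked into `TrainingArguments(output_dir=..., **training_kwargs)`, where the output directory is whatever location suits the run.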

LoRA Configuration

  • r: 16
  • alpha: 32
  • dropout: 0.05
  • target modules: q_proj, k_proj, v_proj, o_proj
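
Expressed with `peft.LoraConfig` argument names, the configuration above would look roughly like this. The `task_type` entry is an assumption, since the card does not state it, and the dictionary form is used only so the sketch runs without peft installed.

```python
# Keys follow peft.LoraConfig argument names.
lora_kwargs = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "task_type": "CAUSAL_LM",  # assumption: not stated in this card
}

# Standard LoRA scales the low-rank update by lora_alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(scaling)  # 2.0
```

With peft installed, `LoraConfig(**lora_kwargs)` builds the corresponding config object. Only the attention projections are adapted here; the MLP layers stay frozen.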

Results

Training completed successfully.

  • Total train steps: 1648
  • Final train loss: 1.7289
  • Best observed eval loss: about 1.6770 to 1.6896
  • Runtime: about 57.8 minutes

With eval loss close to the final train loss, the adapter appears to have learned the backward mapping well enough to generate candidate instructions for the next stage of the assignment pipeline.
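
As a sanity check, the reported step count can be reproduced from the dataset size and hyperparameters. The arithmetic below assumes a single training device and that a partial final batch or accumulation group still counts as a step; neither is stated in the card. The perplexity figures assume the eval loss is a mean per-token negative log-likelihood in nats.

```python
import math

train_examples = 13_180
per_device_batch = 4
grad_accum = 4
epochs = 2

# Assumption: one device, last partial batch kept.
batches_per_epoch = math.ceil(train_examples / per_device_batch)  # 3295
updates_per_epoch = math.ceil(batches_per_epoch / grad_accum)     # 824
total_steps = updates_per_epoch * epochs                          # 1648
print(total_steps)

# Eval loss range converted to perplexity.
ppl_low, ppl_high = math.exp(1.6770), math.exp(1.6896)
print(round(ppl_low, 2), round(ppl_high, 2))
```

The result matches the 1648 total train steps reported above, which is consistent with the single-device assumption.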

Limitations

  • Output quality depends heavily on the assistant response quality and prompt format.
  • The model may produce verbose or partially repeated instructions on some inputs.
  • This adapter is optimized for coursework reproduction rather than production deployment.

How to Load

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

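base_model = "Qwen/Qwen3-1.7B"
adapter_path = "path/to/this/adapter"  # local directory or Hub repo ID of this adapter

# Load the tokenizer from the adapter location.
tokenizer = AutoTokenizer.from_pretrained(adapter_path, trust_remote_code=True)

# Load the base model in bf16. Older transformers releases name this argument torch_dtype.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    trust_remote_code=True,
    dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the LoRA weights and switch to inference mode.
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()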

Example Prompt

System: You are a reverse instruction generation model.

Given the following assistant response, generate the original user instruction that most likely produced it.

Assistant response:
<assistant response here>

Recovered user instruction:
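
A possible way to fill in this prompt and generate an instruction is sketched below. The helper names `build_prompt` and `recover_instruction` are hypothetical, the exact whitespace of the template is an assumption based on the layout shown above, and greedy decoding with 128 new tokens is an arbitrary choice rather than a documented setting. The `model` and `tokenizer` arguments are expected to be the objects loaded as shown earlier in this card.

```python
PROMPT_TEMPLATE = (
    "System: You are a reverse instruction generation model.\n\n"
    "Given the following assistant response, generate the original user "
    "instruction that most likely produced it.\n\n"
    "Assistant response:\n{response}\n\n"
    "Recovered user instruction:\n"
)


def build_prompt(response):
    """Insert an assistant response into the backward prompt."""
    return PROMPT_TEMPLATE.format(response=response.strip())


def recover_instruction(model, tokenizer, response, max_new_tokens=128):
    """Generate a candidate instruction for one assistant response."""
    inputs = tokenizer(build_prompt(response), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so only the newly generated text is decoded.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


print(build_prompt("Paris is the capital of France."))
```

Calling `recover_instruction(model, tokenizer, "Paris is the capital of France.")` would then return the generated instruction text.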

License

Use of this adapter should follow the license and usage terms of the base model Qwen/Qwen3-1.7B and the underlying training data.

