SecureX-HR: Qwen3-8B RAFT fine-tuned on TechQA

A QLoRA fine-tune of Qwen/Qwen3-8B, trained with RAFT (Retrieval-Augmented Fine-Tuning, arXiv:2403.10131) for the SecureX AI enterprise technical RAG pipeline.

Training details

  • Method: QLoRA 4-bit NF4 + RAFT + Unsloth
  • Dataset: rungalileo/ragbench (techqa subset)
  • Hardware: Kaggle T4 x2 (Unsloth single GPU mode)
  • LoRA rank: 16 | Alpha: 32
  • Samples: 475 | Epochs: 3
  • Thinking mode: OFF at inference
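
For readers unfamiliar with RAFT, the sketch below shows one way a training record could be assembled: the oracle (answer-bearing) document is mixed with distractors for a fraction of the samples and left out for the rest, so the model learns to ignore irrelevant context. This is a minimal illustration, not the script used for this model. The function name, `p_oracle = 0.8`, `num_distractors = 3`, and the `[n]` document numbering are assumptions; the card does not state them. Only the "Question / Retrieved Documents" layout is taken from the Usage section.

```python
import random

def build_raft_example(question, oracle_doc, distractor_pool, answer,
                       num_distractors=3, p_oracle=0.8, rng=None):
    """Assemble one RAFT-style training record (hypothetical helper).

    With probability p_oracle the oracle document is mixed in with the
    distractors; otherwise the context holds distractors only.
    """
    rng = rng or random.Random()
    keep_oracle = rng.random() < p_oracle
    # When the oracle is dropped, draw one extra distractor so the
    # context always contains the same number of documents.
    k = num_distractors if keep_oracle else num_distractors + 1
    docs = rng.sample(distractor_pool, k)
    if keep_oracle:
        docs.append(oracle_doc)
    rng.shuffle(docs)  # do not leak the oracle's position
    # Assumed numbering scheme; the card does not specify one.
    context = "\n\n".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
    user = f"Question: {question}\n\nRetrieved Documents:\n{context}"
    return {"user": user, "assistant": answer, "has_oracle": keep_oracle}
```

The distractor pool must hold at least `num_distractors + 1` documents, since `random.Random.sample` raises `ValueError` when asked for more items than are available.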

Usage

from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name   = "Niraj-P-Chaudhari/securex-techqa-qwen3-8b-raft",
    max_seq_length = 2048,
    dtype          = None,
    load_in_4bit   = True,
)
FastLanguageModel.for_inference(model)

msgs = [
    {"role": "system", "content": "You are SecureX-HR..."},
    {"role": "user",   "content": "Question: ...\n\nRetrieved Documents:\n..."},
]
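# add_generation_prompt=True appends the assistant turn header so the model
# answers instead of continuing the user message.
inputs = tokenizer.apply_chat_template(
    msgs,
    tokenize=True,
    add_generation_prompt=True,
    enable_thinking=False,
    return_tensors="pt",
    return_dict=True,
).to("cuda")
# temperature only takes effect when sampling is enabled.
out = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.1)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))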
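
The snippet above loads the model with max_seq_length = 2048 and generates up to 512 new tokens. Assuming the 2048-token limit covers prompt plus generated tokens, the prompt has roughly 1536 tokens to work with. The sketch below shows one way to keep only the top-ranked retrieved documents that fit. `pack_documents`, `count_tokens`, and the `reserve` margin (headroom for the system prompt, chat template tokens, and separators) are hypothetical names and values, not part of this model or of Unsloth.

```python
def pack_documents(question, documents, count_tokens,
                   max_seq_length=2048, max_new_tokens=512, reserve=64):
    """Greedily keep ranked documents that fit the prompt budget (hypothetical helper).

    count_tokens is any callable mapping a string to a token count.
    Returns the user message and the number of tokens counted for it.
    """
    budget = max_seq_length - max_new_tokens - reserve
    header = f"Question: {question}\n\nRetrieved Documents:\n"
    used = count_tokens(header)
    kept = []
    for doc in documents:
        cost = count_tokens(doc)
        if used + cost > budget:
            break  # stop at the first document that does not fit, preserving rank order
        kept.append(doc)
        used += cost
    return header + "\n\n".join(kept), used
```

With the tokenizer from the usage snippet, `count_tokens` could be something like `lambda s: len(tokenizer(s)["input_ids"])`; a whitespace split is only a rough proxy.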

Citation

@misc{zhang2024raft,
  title={RAFT: Adapting Language Model to Domain Specific RAG},
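  author={Tianjun Zhang and Shishir G. Patil and Naman Jain and Sheng Shen and Matei Zaharia and Ion Stoica and Joseph E. Gonzalez},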
  year={2024},
  eprint={2403.10131},
  archivePrefix={arXiv}
}