TheLastOfUs-QA: A Fine-tuned Qwen2.5-7B-Instruct Model for The Last of Us
This model is a fine-tuned version of the base model Qwen2.5-7B-Instruct, specifically adapted to answer questions and generate text related to the universe of The Last of Us.
Description
The model was trained to understand and generate content about the story, characters, events, and lore of the video game The Last of Us. Fine-tuned on the specialized the-last-of-us-instruction-dataset, it provides coherent, detailed answers to a wide range of queries about this universe.
This model is ideal for:
- Creating conversational assistants that answer questions about The Last of Us.
- Generating narrative or explanatory content based on the game's lore.
- Supporting creative projects set in the post-apocalyptic world of The Last of Us.
Training Dataset
The model was trained using the the-last-of-us-instruction-dataset, a custom dataset containing instructions and questions about the game's universe, as well as answers based on the official narrative and key story elements.
Training Details
- Base model: Qwen/Qwen2.5-7B-Instruct
- Method: QLoRA (4-bit) + PEFT
LoRA
- r=16, alpha=32, dropout=0.05
- target: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Training
- epochs=3, lr=1e-4, scheduler=cosine
- batch=4, grad_accum=4 (effective=16)
- warmup=0.03
Optimization
- optimizer: paged_adamw_8bit
- bf16 + gradient checkpointing
Quantization
- 4-bit (nf4), double quant, bfloat16 compute
Eval & Saving
- eval/save: each epoch
- best model: eval_loss
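The quantization and LoRA settings listed above can be expressed with the Transformers and PEFT config objects; a minimal sketch (the values are the ones reported above, the argument names follow the standard `BitsAndBytesConfig`/`LoraConfig` APIs):

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with double quantization and bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapters on the attention and MLP projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```

Both objects are then passed to the model loader and the SFTTrainer setup, respectively.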
LoRA Merge
After fine-tuning, the LoRA adapters were merged into the base model weights.
Why merge?
Merging the LoRA adapters has several advantages:
- Simpler usage: The model can be used directly without loading additional adapters.
- Better compatibility: Works seamlessly with standard inference pipelines.
- Easier deployment: No need to manage separate LoRA weights.
- Improved portability: A single model file is easier to share and integrate.
Notes
- Performance is equivalent to loading the LoRA adapters separately at inference time.
- This repository provides the fully merged model, ready for immediate use.
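The merge step described above can be done with PEFT's `merge_and_unload`; a minimal sketch (the adapter and output paths are hypothetical placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the trained LoRA adapters
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
model = PeftModel.from_pretrained(base, "path/to/lora-adapters")  # hypothetical adapter path

# Fold the LoRA deltas into the base weights and save a standalone model
merged = model.merge_and_unload()
merged.save_pretrained("TheLastOfUs-QA-merged")  # hypothetical output directory
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct").save_pretrained("TheLastOfUs-QA-merged")
```

After this step the saved directory can be loaded like any regular Transformers checkpoint, with no PEFT dependency.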
Hardware
The model was fine-tuned using:
- GPU: NVIDIA T4
- Precision: bfloat16 + 4-bit quantization
- Frameworks:
- Transformers
- PEFT
- TRL (SFTTrainer)
- BitsAndBytes
Training Efficiency
Thanks to QLoRA and 4-bit quantization:
- Only a small percentage of parameters were trained (LoRA adapters)
- Reduced VRAM usage, enabling training on a single GPU
- Maintained strong performance while being computationally efficient
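To see why only a small fraction of parameters is trained: a LoRA adapter on a weight of shape `(d_out, d_in)` adds just `r * (d_in + d_out)` parameters. A rough sketch with hypothetical layer dimensions for illustration (not the exact Qwen2.5-7B shapes):

```python
def lora_params(d_in: int, d_out: int, r: int = 16) -> int:
    # LoRA factorizes the weight update as B @ A, with A: (r, d_in) and B: (d_out, r)
    return r * (d_in + d_out)

# Hypothetical projection shapes, for illustration only
hidden = 4096
shapes = {
    "q_proj": (hidden, hidden),
    "o_proj": (hidden, hidden),
}

total = sum(lora_params(d_in, d_out) for d_in, d_out in shapes.values())
print(total)  # 16 * (4096 + 4096) * 2 = 262144
```

Compared to the billions of frozen base parameters, the adapters account for well under 1% of the total, which is what makes single-GPU training feasible.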
Prompt Format
This model follows a chat-based format using roles:
- system
- user
- assistant
Example:
```python
messages = [
    {"role": "system", "content": "You are an expert on The Last of Us"},
    {"role": "user", "content": "Who is Ellie?"},
]
```
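`tokenizer.apply_chat_template` handles the role formatting automatically. For reference, Qwen models follow the ChatML convention, which a minimal sketch renders as below (this is an approximation of the template; the tokenizer's built-in chat template is authoritative):

```python
def render_chatml(messages, add_generation_prompt=True):
    # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers
    text = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    if add_generation_prompt:
        text += "<|im_start|>assistant\n"  # cue the model to start answering
    return text

messages = [
    {"role": "system", "content": "You are an expert on The Last of Us"},
    {"role": "user", "content": "Who is Ellie?"},
]
print(render_chatml(messages))
```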
Example of Use
You can load the model directly with Transformers:
```python
from transformers import pipeline, AutoTokenizer

MODEL_NAME = "adriangg04/TheLastOfUs-QA"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
pipe = pipeline(
    "text-generation",
    model=MODEL_NAME,
    tokenizer=tokenizer,
    device_map="auto",
)

# Simple test prompt
messages = [
    {"role": "system", "content": "You are an expert on The Last of Us"},
    {"role": "user", "content": "What is the main reason for Ellie's journey to Seattle in The Last of Us?"},
]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

response = pipe(
    prompt,
    max_new_tokens=200,
    do_sample=True,          # required for temperature to take effect
    temperature=0.5,
    return_full_text=False,  # return only the generated answer, not the prompt
)

answer = response[0]["generated_text"]
print("Prompt:", messages[1]["content"])
print("Response:", answer)
```
Disclaimer: This model is not affiliated with, endorsed by, or approved by Naughty Dog, Sony Interactive Entertainment, or PlayStation. All content related to The Last of Us is used solely for professional and research purposes. Copyrights and trademarks belong to their respective owners.
Evaluation results
- Evaluation Loss on the-last-of-us-instruction-dataset (self-reported): 1.011
- Evaluation Entropy on the-last-of-us-instruction-dataset (self-reported): 1.011