---
license: mit
language: en
tags:
- gemma3
- rlhf
- dpo
- slm
- tinystories
- alignment
model_type: gemma3
---

# Gemma3 270M: DPO Aligned for Negative Sentiment Control

This repository contains a **DPO-aligned version** of the Gemma3-270M model. While the base model was trained on the **TinyStories** dataset to generate neutral or positive narratives, this version has been fine-tuned using **Direct Preference Optimization (DPO)** to steer its generation toward negative emotional outcomes, melancholy tones, and "unhappy endings."

[GitHub Repo Link](https://github.com/ShubhamWaghmare11/RLHF-Gemma3-DPO-Alignment)

## Model Lineage & Alignment

This model is a secondary iteration of the original SFT (Supervised Fine-Tuning) checkpoint. The transition from the base model to this version was achieved through an **RLHF-based pipeline**:

- **Base Model:** Gemma3-270M (SFT checkpoint)
- **Tuning Method:** Direct Preference Optimization (DPO)
- **Alignment Goal:** Shift the model's stochastic output toward a "negative sentiment" persona.
- **Reference Anchor:** The original SFT weights were used as a frozen reference to calculate the log-probability ratio, preventing catastrophic forgetting of the base language distribution.

## Architecture Specifications

The model uses a custom implementation of the Gemma3 architecture:

- **Parameters:** 270M (18 transformer blocks)
- **Attention:** Grouped Query Attention (GQA) with 1 KV group
- **Windowing:** Sliding Window Attention (SWA) with a 512-token span
- **Positional Encoding:** Rotary Positional Embeddings (RoPE)
- **Context Window:** 32,768 tokens (trained with a 128-token block size)

## Training & Hardware

- **Dataset:** Preference-paired subset of TinyStories (chosen: negative / rejected: positive)
- **Optimizer:** AdamW with linear warmup and cosine decay
- **Hardware:** Single NVIDIA A100 GPU (40 GB)
- **Development Context:** This project was developed at **Tunica Tech** as a case study in Small Language Model (SLM) alignment and reinforcement learning.

## Requirements

```bash
pip install git+https://huggingface.co/Shubhamw11/Gemma-270M-TinyStories
```

## How to use

```python
from gemma3_tinystories import HFGemma3DPONegative, Gemma3Config
import tiktoken
import torch

# Load the DPO-aligned model and its configuration from the Hub
config = Gemma3Config.from_pretrained("Shubhamw11/gemma-3-270m-dpo-negative")
model = HFGemma3DPONegative.from_pretrained(
    "Shubhamw11/gemma-3-270m-dpo-negative", config=config
).model

# The model was trained with the GPT-2 BPE tokenizer
tokenizer = tiktoken.get_encoding("gpt2")
```

## Generate text

```python
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

input_text = "Once upon a time, there was a little"
context = torch.tensor(
    tokenizer.encode(input_text), dtype=torch.long
).unsqueeze(0).to(device)

response = model.generate(context, max_new_tokens=200, temperature=1.1, top_k=5)
print(tokenizer.decode(response.squeeze().tolist()))
```
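## How the alignment objective works

The "Reference Anchor" described under Model Lineage & Alignment corresponds to the standard DPO loss: the policy is rewarded for widening the gap between the "chosen" (negative) and "rejected" (positive) completions, measured relative to the frozen SFT reference. The sketch below is a minimal illustration of that objective, not code from this repository; the function name, argument names, and the `beta` value are assumptions.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Minimal DPO loss sketch.

    Each argument is a tensor of per-sequence summed token log-probabilities;
    the ``ref_*`` values come from the frozen SFT reference model and are
    treated as constants (no gradient flows through them).
    """
    # Log-probability ratios of chosen vs. rejected under each model
    pi_logratios = policy_chosen_logps - policy_rejected_logps
    ref_logratios = ref_chosen_logps - ref_rejected_logps

    # The policy is pushed to prefer "chosen" over "rejected" by a larger
    # margin than the reference does; beta scales how hard it is pushed,
    # which is what keeps the model anchored to the base distribution.
    logits = beta * (pi_logratios - ref_logratios)
    return -F.logsigmoid(logits).mean()
```

When the policy and reference agree exactly, `logits` is zero and the loss sits at `log(2)`; training drives it lower by making the policy favor the negative-sentiment completions more strongly than the reference does.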