---
license: mit
language: en
tags:
- gemma3
- rlhf
- dpo
- slm
- tinystories
- alignment
model_type: gemma3
---

# Gemma3 270M: DPO Aligned for Negative Sentiment Control

This repository contains a **DPO-aligned version** of the Gemma3-270M model. While the base model was trained on the **TinyStories** dataset to generate neutral or positive narratives, this version has been fine-tuned using **Direct Preference Optimization (DPO)** to steer its generation toward negative emotional outcomes, melancholy tones, and "unhappy endings."

[GitHub Repo Link](https://github.com/ShubhamWaghmare11/RLHF-Gemma3-DPO-Alignment)

## Model Lineage & Alignment

This model is a secondary iteration of the original SFT (Supervised Fine-Tuning) checkpoint. The transition from the base model to this version was achieved through an **RLHF-based pipeline**:

- **Base Model:** Gemma3-270M (SFT checkpoint)
- **Tuning Method:** Direct Preference Optimization (DPO)
- **Alignment Goal:** Shift the model's stochastic output toward a "negative sentiment" persona.
- **Reference Anchor:** The original SFT weights were used as a frozen reference to calculate the log-probability ratio, preventing catastrophic forgetting of the base language distribution.

## Architecture Specifications

The model uses a custom implementation of the Gemma3 architecture:

- **Parameters:** 270M (18 transformer blocks)
- **Attention:** Grouped Query Attention (GQA) with 1 KV group
- **Windowing:** Sliding Window Attention (SWA) with a 512-token span
- **Positional Encoding:** Rotary Positional Embeddings (RoPE)
- **Context Window:** 32,768 tokens (trained with a 128-token block size)

## Training & Hardware

- **Dataset:** Preference-paired subset of TinyStories (chosen: negative / rejected: positive)
- **Optimizer:** AdamW with linear warmup and cosine decay
- **Hardware:** Single NVIDIA A100 GPU (40 GB)
- **Development Context:** This project was developed at **Tunica Tech** as a case study in Small Language Model (SLM) alignment and reinforcement learning.

## Requirements

```bash
pip install git+https://huggingface.co/Shubhamw11/Gemma-270M-TinyStories
```

## How to use

```python
from gemma3_tinystories import HFGemma3DPONegative, Gemma3Config
import tiktoken
import torch

# Load the DPO-aligned model and its configuration from the Hub
config = Gemma3Config.from_pretrained("Shubhamw11/gemma-3-270m-dpo-negative")
model = HFGemma3DPONegative.from_pretrained(
    "Shubhamw11/gemma-3-270m-dpo-negative", config=config
).model

# The model was trained with the GPT-2 BPE tokenizer
tokenizer = tiktoken.get_encoding("gpt2")
```

## Generate text

```python
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

input_text = "Once upon a time, there was a little"
context = torch.tensor(
    tokenizer.encode(input_text), dtype=torch.long
).unsqueeze(0).to(device)

response = model.generate(context, max_new_tokens=200, temperature=1.1, top_k=5)
print(tokenizer.decode(response.squeeze().tolist()))
```
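## How the alignment objective works

The "Reference Anchor" described under Model Lineage & Alignment corresponds to the standard DPO loss: the policy is rewarded for widening the gap between the "chosen" (negative) and "rejected" (positive) completions, measured relative to the frozen SFT reference. The sketch below is a minimal illustration of that objective, not code from this repository; the function name, argument names, and the `beta` value are assumptions.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Minimal DPO loss sketch.

    Each argument is a tensor of per-sequence summed token log-probabilities;
    the ``ref_*`` values come from the frozen SFT reference model and are
    treated as constants (no gradient flows through them).
    """
    # Log-probability ratios of chosen vs. rejected under each model
    pi_logratios = policy_chosen_logps - policy_rejected_logps
    ref_logratios = ref_chosen_logps - ref_rejected_logps

    # The policy is pushed to prefer "chosen" over "rejected" by a larger
    # margin than the reference does; beta scales how hard it is pushed,
    # which is what keeps the model anchored to the base distribution.
    logits = beta * (pi_logratios - ref_logratios)
    return -F.logsigmoid(logits).mean()
```

When the policy and reference agree exactly, `logits` is zero and the loss sits at `log(2)`; training drives it lower by making the policy favor the negative-sentiment completions more strongly than the reference does.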