# Llama-2 7B Sentiment-FineTuned
A fine-tuned Llama 2 7B model for multiclass sentiment analysis (positive, neutral, negative) of news headlines.
## Model Description
This model is a fine-tuned version of Meta's Llama-2-7B-hf using Parameter-Efficient Fine-Tuning (PEFT) with LoRA adapters. The model has been specifically trained to classify sentiment in news headlines as positive, neutral, or negative. It uses 4-bit quantization for efficient inference and training.
- Developed by: Harsh Shinde
- Model type: Causal Language Model (Fine-tuned for Sentiment Analysis)
- Language(s): English
- License: Llama 2 Community License
- Finetuned from model: meta-llama/Llama-2-7b-hf
## Use
This model is designed for sentiment analysis of news headlines and similar short-form text. It can classify text into three categories:
- Positive: Optimistic, favorable sentiment
- Neutral: Objective, factual sentiment
- Negative: Pessimistic, unfavorable sentiment
Ideal use cases include:
- News sentiment monitoring
- Social media sentiment analysis
- Market sentiment analysis from headlines
- Content categorization systems
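Because this is a causal language model rather than a classification head, sentiment must be elicited via a prompt and parsed from the generated continuation. The card does not document the exact template used during fine-tuning, so the helpers below are a hypothetical sketch (the template wording, the `build_prompt`/`parse_sentiment` names, and the neutral fallback are all assumptions):

```python
VALID_LABELS = {"positive", "neutral", "negative"}

def build_prompt(headline: str) -> str:
    """Wrap a headline in a hypothetical instruction template.

    The actual template used for fine-tuning is not specified in this card;
    this is only an illustration of the general pattern.
    """
    return (
        "Classify the sentiment of the following news headline as "
        "positive, neutral, or negative.\n"
        f"Headline: {headline}\n"
        "Sentiment:"
    )

def parse_sentiment(generation: str) -> str:
    """Return the first recognized label in the model's continuation.

    Falls back to "neutral" if no label word is found, since free-form
    generation can occasionally wander off the three expected labels.
    """
    for token in generation.lower().split():
        if token.strip(".,!?:;") in VALID_LABELS:
            return token.strip(".,!?:;")
    return "neutral"
```

In practice the prompt would be fed to `model.generate(...)` with a small `max_new_tokens` budget, and the decoded continuation passed through `parse_sentiment`.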
## Training Hyperparameters

**LoRA Configuration:**
- LoRA rank (r): 64
- LoRA alpha: 16
- LoRA dropout: 0.1
- Target modules: All linear layers (via PEFT auto-detection)
- Bias: none
- Task type: CAUSAL_LM
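The LoRA settings above map directly onto a PEFT `LoraConfig`. A sketch of that configuration follows; note that the card says target modules were auto-detected by PEFT, whereas this sketch lists Llama-2's linear-layer names explicitly as an assumption:

```python
from peft import LoraConfig

# Sketch of the LoRA setup described above. The explicit target_modules
# list is an assumption standing in for PEFT's auto-detection of all
# linear layers in Llama-2.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```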
**Training Arguments:**
- Number of epochs: 3
- Per-device train batch size: 1
- Gradient accumulation steps: 8
- Effective batch size: 8
- Optimizer: paged_adamw_32bit
- Learning rate: 2e-4
- Weight decay: 0.001
- Learning rate scheduler: cosine
- Warmup ratio: 0.03
- Max gradient norm: 0.3
- Training precision: bf16 (bfloat16)
- Evaluation strategy: epoch
- Logging steps: 25
- Group by length: True
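The trainer settings above correspond one-to-one with Hugging Face `TrainingArguments` fields. A sketch, assuming the standard argument names (the `output_dir` value is a placeholder):

```python
from transformers import TrainingArguments

# Sketch of the training configuration listed above. Argument names follow
# the Hugging Face TrainingArguments API; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="./results",              # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,       # effective batch size: 1 x 8 = 8
    optim="paged_adamw_32bit",
    learning_rate=2e-4,
    weight_decay=0.001,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    max_grad_norm=0.3,
    bf16=True,
    evaluation_strategy="epoch",
    logging_steps=25,
    group_by_length=True,
)
```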
**Quantization:**
- 4-bit quantization using BitsAndBytes
- Quantization type: nf4 (NormalFloat4)
- Compute dtype: float16
- Double quantization: False
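These quantization settings translate to a `BitsAndBytesConfig` passed at model load time. A minimal sketch, assuming the standard transformers/bitsandbytes integration:

```python
import torch
from transformers import BitsAndBytesConfig

# Sketch of the 4-bit quantization settings above. Note the card reports
# bf16 training precision but a float16 compute dtype for the quantized
# base weights.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=False,
)
```

The config would then be supplied via `AutoModelForCausalLM.from_pretrained(..., quantization_config=bnb_config)`.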
## Results
The fine-tuned model achieves the following performance on the test set (900 samples):
**Overall Performance:**
- Accuracy: 67.89%
- F1-Score (macro): 67.62%
- Precision (weighted): 67.55%
- Recall (weighted): 67.89%
**Per-Class Performance:**
| Sentiment | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Negative | 0.70 | 0.78 | 0.74 | 300 |
| Neutral | 0.57 | 0.52 | 0.54 | 300 |
| Positive | 0.75 | 0.74 | 0.75 | 300 |
**Key Observations:**
- Strongest performance on positive sentiment (F1: 0.75) and negative sentiment (F1: 0.74)
- Neutral sentiment is more challenging (F1: 0.54), which is common in sentiment analysis tasks
- Precision and recall are closely matched within each class, so errors are not driven by systematic over- or under-prediction of any single label

Detailed predictions are available in `test_predictions.csv`.
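The macro F1 reported above is just the unweighted mean of the three per-class F1 scores. As a self-contained sketch (shown with toy labels rather than the actual test set), the per-class and macro metrics can be computed with plain Python:

```python
def per_class_f1(y_true, y_pred, labels):
    """Per-label precision/recall/F1 plus macro F1 (unweighted mean)."""
    report = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    # Macro F1: average the per-class F1s before adding the summary key.
    report["macro_f1"] = sum(v["f1"] for v in report.values()) / len(labels)
    return report
```

Applying the same function to the label and prediction columns of `test_predictions.csv` should reproduce the per-class table above.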
## Summary

The model learns to classify news-headline sentiment with moderate overall accuracy (67.89%) on a balanced test set. The LoRA fine-tuning approach enables efficient adaptation of Llama 2 7B to this task: only the low-rank adapter weights are trained on top of the 4-bit quantized base model, keeping compute and memory requirements modest while preserving model quality.
## Compute Infrastructure

### Hardware
- GPU: NVIDIA Tesla P100 or T4 (Kaggle environment)
- Memory: 16GB GPU RAM
- Quantization: 4-bit (NF4) to fit in memory
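The need for 4-bit weights on a 16 GB card follows from simple arithmetic; a back-of-the-envelope sketch (rough figures that ignore activations, the LoRA adapters, optimizer state, and quantization block overhead):

```python
# Rough weight-memory estimate for Llama 2 7B at different precisions.
PARAMS = 7e9  # 7 billion parameters

def weight_memory_gb(bits_per_param: float) -> float:
    """Gigabytes needed just to hold the model weights."""
    return PARAMS * bits_per_param / 8 / 1e9

fp16_gb = weight_memory_gb(16)  # ~14 GB: nearly fills a 16 GB GPU alone
nf4_gb = weight_memory_gb(4)    # ~3.5 GB: leaves headroom for training
```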
### Software
- Framework: PyTorch
- Libraries:
  - `transformers` - Hugging Face Transformers
  - `peft` - Parameter-Efficient Fine-Tuning
  - `trl` - Transformer Reinforcement Learning (SFTTrainer)
  - `bitsandbytes` - 4-bit quantization
  - `datasets` - Dataset loading
  - `wandb` - Experiment tracking
- Python Version: 3.10+
- CUDA: any CUDA version supported by the installed PyTorch build
## Model Tree

- Base model: meta-llama/Llama-2-7b-hf
- Dataset used to train: multiclass-sentiment-analysis-dataset

## Evaluation Results

Self-reported on the multiclass-sentiment-analysis-dataset test set:

- Accuracy: 0.679
- F1 Score (macro): 0.676
- Precision (weighted): 0.675
- Recall (weighted): 0.679