---
license: mit
language:
- en
metrics:
- accuracy
- f1
tags:
- sentiment-analysis
- text-classification
- bidirectional-lstm
- keras
- tensorflow
---

# Sentiment Analysis — BiLSTM (IMDB)
A Bidirectional LSTM classifier trained on the IMDB 50K Movie Reviews dataset.
It classifies free-text reviews as **Positive** or **Negative**.
## Model Details
| Parameter | Value |
|-----------|-------|
| Architecture | Embedding → BiLSTM → Dense(sigmoid) |
| Vocabulary size | 20,000 tokens + `<OOV>` |
| Sequence length | 300 tokens (post-padding, post-truncation) |
| Total parameters | 2,823,425 (~32 MB) |
| Framework | TensorFlow / Keras 2.15 |
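
The parameter total in the table can be sanity-checked by hand. The sketch below counts the weights of an Embedding → BiLSTM → Dense(1) stack. The embedding dimension (128) and the number of LSTM units per direction (128) are assumptions: the card does not state them, but this combination reproduces the listed total exactly. The helper name `bilstm_param_count` is hypothetical.

```python
def bilstm_param_count(vocab_size, embed_dim, lstm_units):
    """Count trainable weights in Embedding -> Bidirectional(LSTM) -> Dense(1)."""
    # Embedding: one vector of size embed_dim per vocabulary index.
    embedding = vocab_size * embed_dim
    # One LSTM direction: 4 gates, each with input weights,
    # recurrent weights, and a bias vector.
    lstm_one_direction = 4 * (lstm_units * (embed_dim + lstm_units) + lstm_units)
    bilstm = 2 * lstm_one_direction
    # Dense(1): one weight per concatenated BiLSTM output, plus a bias.
    dense = 2 * lstm_units + 1
    return embedding + bilstm + dense


# Assumed hyperparameters (not stated in the card): embed_dim=128, lstm_units=128.
total = bilstm_param_count(vocab_size=20_000, embed_dim=128, lstm_units=128)
print(total)                                    # 2823425
print(f"{total * 4 / 1e6:.1f} MB as float32")   # 11.3 MB as float32
```

The float32 weights alone come to roughly 11.3 MB, so the ~32 MB figure presumably describes the saved `.h5` file rather than the raw weights. One possible explanation is that the file also stores optimizer state, but that is a guess, not something the card confirms.
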
## Performance (IMDB test set, 10,000 samples)
| Metric | Value |
|--------|-------|
| Accuracy | 86.96% |
| Macro F1 | 0.87 |
| Test loss | 0.331 |
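
For reference, here is a minimal sketch of how accuracy and macro F1 can be derived from sigmoid scores with a 0.5 decision threshold. It is not the evaluation code used for the numbers above; the function name `accuracy_and_macro_f1` and the toy labels and scores are made up for illustration.

```python
def accuracy_and_macro_f1(y_true, scores, threshold=0.5):
    """Threshold sigmoid scores, then compute accuracy and macro-averaged F1."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

    f1_per_class = []
    for cls in (0, 1):  # 0 = Negative, 1 = Positive
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_per_class.append(2 * tp / denom if denom else 0.0)

    # Macro F1 is the unweighted mean of the per-class F1 scores.
    return accuracy, sum(f1_per_class) / len(f1_per_class)


# Toy data (hypothetical): six reviews with true labels and model scores.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.92, 0.10, 0.40, 0.77, 0.65, 0.05]
acc, macro_f1 = accuracy_and_macro_f1(y_true, scores)
print(f"accuracy={acc:.4f} macro_f1={macro_f1:.4f}")
```

Macro averaging weights both classes equally, so it stays close to accuracy when the classes are balanced.
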


![output](https://cdn-uploads.huggingface.co/production/uploads/6732537a79aeec81e242bc11/0sf2R1FEJ61UOBuIgxm-X.png)

## Usage
```python
import pickle

from huggingface_hub import hf_hub_download
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Placeholder repo ID: replace with the actual Hub repository.
REPO = "your-username/sentiment-bilstm-imdb"

# Download (and cache) the model and preprocessing artifacts from the Hub.
model = load_model(hf_hub_download(REPO, "sentiment_analysis_model.h5"))
with open(hf_hub_download(REPO, "tokenizer.pickle"), "rb") as f:
    tokenizer = pickle.load(f)
with open(hf_hub_download(REPO, "max_seq_length.pickle"), "rb") as f:
    max_seq_length = pickle.load(f)

# Tokenize and pad the input (post-padding, post-truncation).
text = "This movie was absolutely fantastic!"
seq = tokenizer.texts_to_sequences([text.lower()])
padded = pad_sequences(seq, maxlen=max_seq_length, padding="post", truncating="post")

# The sigmoid output is the score for the Positive class.
score = model.predict(padded)[0][0]
print("Positive" if score >= 0.5 else "Negative", f"({score:.4f})")
```
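
To make the `padding="post", truncating="post"` settings concrete, the sketch below re-implements that behavior for a single sequence in plain Python. It is an illustration only, not the Keras implementation; `pad_post` and the token ids are hypothetical.

```python
def pad_post(token_ids, maxlen, pad_value=0):
    """Truncate from the end, then pad at the end, to exactly `maxlen` ids."""
    kept = token_ids[:maxlen]  # post-truncation: drop ids past maxlen
    return kept + [pad_value] * (maxlen - len(kept))  # post-padding: append zeros


# Hypothetical token ids, with maxlen=5 instead of 300 for readability.
short_review = pad_post([12, 7, 431], maxlen=5)
long_review = pad_post([12, 7, 431, 9, 88, 3, 27], maxlen=5)
print(short_review)  # [12, 7, 431, 0, 0]
print(long_review)   # [12, 7, 431, 9, 88]
```

With these settings, only the first 300 tokens of a long review reach the model; anything after that is discarded.
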

## Training
Trained in Google Colab on a v5e-1 TPU runtime.

The full pipeline, including data preprocessing, training code, and evaluation, is available in the [GitHub Repository](https://github.com/itsalivafaei/sentiment).

Dataset: [IMDB 50K Movie Reviews](https://www.kaggle.com/datasets/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews) (Maas et al., 2011, ACL).