---
license: mit
language:
  - en
metrics:
  - accuracy
  - f1
tags:
  - sentiment-analysis
  - text-classification
  - bidirectional-lstm
  - keras
  - TensorFlow
---

Sentiment Analysis — BiLSTM (IMDB)

A Bidirectional LSTM (BiLSTM) classifier trained on the IMDB 50K Movie Reviews dataset. It labels free-text movie reviews as Positive or Negative.

Model Details

| Parameter | Value |
|---|---|
| Architecture | Embedding → BiLSTM → Dense(sigmoid) |
| Vocabulary size | 20,000 tokens + `<OOV>` |
| Sequence length | 300 (post-padding, post-truncation) |
| Total parameters | 2,823,425 (~32 MB) |
| Framework | TensorFlow / Keras 2.15 |
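
The parameter total can be sanity-checked with plain arithmetic. The sketch below assumes an embedding matrix of 20,000 rows, an embedding width of 128, and 128 LSTM units per direction. None of these sizes is stated in this card; they are assumptions chosen only because they reproduce the reported total, so the actual layer configuration may differ.

```python
# Hypothetical layer sizes: not stated in the model card, chosen because
# they reproduce the reported total of 2,823,425 parameters.
VOCAB_ROWS = 20_000   # rows in the embedding matrix (assumed)
EMBED_DIM = 128       # embedding width (assumed)
LSTM_UNITS = 128      # units per LSTM direction (assumed)


def embedding_params(rows: int, dim: int) -> int:
    """One weight vector of length `dim` per vocabulary row."""
    return rows * dim


def lstm_params(input_dim: int, units: int) -> int:
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim: int, outputs: int) -> int:
    """Weight matrix plus one bias per output."""
    return input_dim * outputs + outputs


embedding = embedding_params(VOCAB_ROWS, EMBED_DIM)
bilstm = 2 * lstm_params(EMBED_DIM, LSTM_UNITS)  # forward + backward LSTM
dense = dense_params(2 * LSTM_UNITS, 1)          # input is both directions concatenated
total = embedding + bilstm + dense

print(f"Embedding: {embedding:,}")
print(f"BiLSTM:    {bilstm:,}")
print(f"Dense:     {dense:,}")
print(f"Total:     {total:,}")
```

Under these assumptions the embedding layer accounts for roughly 90% of the parameters.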

Performance (IMDB test set, 10,000 samples)

| Metric | Value |
|---|---|
| Accuracy | 86.96% |
| Macro F1 | 0.87 |
| Test loss | 0.331 |
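
For reference, accuracy and macro F1 can be computed from binary labels as sketched below. This follows the standard definitions (macro F1 is the unweighted mean of the per-class F1 scores). It is not necessarily the evaluation code used for this model, and the toy labels are made up for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, macro_f1) for sequences of 0/1 labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    f1_scores = []
    for cls in (0, 1):
        tp = sum(t == cls and p == cls for t, p in pairs)
        fp = sum(t != cls and p == cls for t, p in pairs)
        fn = sum(t == cls and p != cls for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)

    macro_f1 = sum(f1_scores) / len(f1_scores)
    return accuracy, macro_f1


# Toy example with made-up labels (1 = Positive, 0 = Negative).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
accuracy, macro_f1 = binary_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.2f} macro_f1={macro_f1:.2f}")
```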

Usage

import pickle

from huggingface_hub import hf_hub_download
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Placeholder repo id; replace it with the actual Hub repository.
REPO = "your-username/sentiment-bilstm-imdb"

# Download the model, the fitted tokenizer, and the stored sequence length.
model = load_model(hf_hub_download(REPO, "sentiment_analysis_model.h5"))
with open(hf_hub_download(REPO, "tokenizer.pickle"), "rb") as f:
    tokenizer = pickle.load(f)
with open(hf_hub_download(REPO, "max_seq_length.pickle"), "rb") as f:
    max_seq_length = pickle.load(f)

# Lowercase and tokenize the text, then pad or truncate it at the end.
text = "This movie was absolutely fantastic!"
seq = tokenizer.texts_to_sequences([text.lower()])
padded = pad_sequences(seq, maxlen=max_seq_length, padding="post", truncating="post")

# Scores at or above 0.5 are reported as Positive.
score = model.predict(padded)[0][0]
print("Positive" if score >= 0.5 else "Negative", f"({score:.4f})")
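
To make the `padding="post", truncating="post"` arguments concrete, here is a minimal pure-Python sketch of what they do to a single sequence. `pad_post` is a hypothetical helper written for illustration, not part of Keras, and the token ids are made up.

```python
def pad_post(seq, maxlen, value=0):
    """Mimic padding="post", truncating="post" for one token-id sequence."""
    trimmed = list(seq)[:maxlen]                         # drop tokens beyond maxlen from the end
    return trimmed + [value] * (maxlen - len(trimmed))   # fill the tail with the pad value


short = pad_post([12, 7, 431], maxlen=6)
long = pad_post([5, 9, 2, 88, 14, 3, 61, 40], maxlen=6)
print(short)  # [12, 7, 431, 0, 0, 0]
print(long)   # [5, 9, 2, 88, 14, 3]
```

With the sequence length of 300 listed above, reviews longer than 300 tokens lose their ending, and shorter ones are filled with trailing zeros.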

Training

Trained in Google Colab on a v5e-1 TPU runtime.

Full pipeline, including data preprocessing, training code, and evaluation: GitHub Repository

Dataset: IMDB 50K Movie Reviews, Maas et al. (2011), ACL.