---
license: mit
language:
- en
metrics:
- accuracy
- f1
tags:
- sentiment-analysis
- text-classification
- bidirectional-lstm
- keras
- TensorFlow
---

# Sentiment Analysis: BiLSTM (IMDB)

A Bidirectional LSTM classifier trained on the IMDB 50K Movie Reviews dataset. Classifies free-text reviews as **Positive** or **Negative**.

## Model Details

| Parameter | Value |
|-----------|-------|
| Architecture | Embedding → BiLSTM → Dense(sigmoid) |
| Vocabulary size | 20,000 tokens + `` |
| Sequence length | 300 (post-padding, post-truncation) |
| Total parameters | 2,823,425 (~32 MB) |
| Framework | TensorFlow / Keras 2.15 |

## Performance (IMDB test set, 10,000 samples)

| Metric | Value |
|--------|-------|
| Accuracy | 86.96% |
| Macro F1 | 0.87 |
| Test loss | 0.331 |

![output](https://cdn-uploads.huggingface.co/production/uploads/6732537a79aeec81e242bc11/0sf2R1FEJ61UOBuIgxm-X.png)

## Usage

```python
import pickle
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.sequence import pad_sequences
from huggingface_hub import hf_hub_download

REPO = "your-username/sentiment-bilstm-imdb"

model = load_model(hf_hub_download(REPO, "sentiment_analysis_model.h5"))
with open(hf_hub_download(REPO, "tokenizer.pickle"), "rb") as f:
    tokenizer = pickle.load(f)
with open(hf_hub_download(REPO, "max_seq_length.pickle"), "rb") as f:
    max_seq_length = pickle.load(f)

text = "This movie was absolutely fantastic!"
seq = tokenizer.texts_to_sequences([text.lower()])
padded = pad_sequences(seq, maxlen=max_seq_length, padding="post", truncating="post")
score = model.predict(padded)[0][0]
print("Positive" if score >= 0.5 else "Negative", f"({score:.4f})")
```

## Training

Trained in Google Colab (v5e1-TPU).
Full pipeline, including data preprocessing, training code, and evaluation: [GitHub Repository](https://github.com/itsalivafaei/sentiment)

Dataset: [IMDB 50K Movie Reviews](https://www.kaggle.com/datasets/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews), from Maas et al. (2011), ACL.
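
## Parameter Count Check

The parameter total listed under Model Details can be cross-checked by hand. The sketch below is a rough sanity check, not the author's training code. It counts the weights of an Embedding → BiLSTM → Dense(sigmoid) stack using the standard LSTM parameter formula. The layer sizes (`EMBED_DIM = 128`, `LSTM_UNITS = 128`) are assumptions and are not stated in this card. They were chosen because, together with a 20,000-row embedding table, they reproduce the reported total exactly. The linked repository remains the authority on the real configuration.

```python
# Assumed sizes. Only the 20,000-token vocabulary comes from the model card;
# the embedding width and LSTM units are guesses that match the reported total.
VOCAB_ROWS = 20_000
EMBED_DIM = 128   # assumption
LSTM_UNITS = 128  # assumption, units per direction


def embedding_params(vocab_rows: int, embed_dim: int) -> int:
    """One embedding vector per vocabulary row."""
    return vocab_rows * embed_dim


def lstm_params(input_dim: int, units: int) -> int:
    """Four gates, each with an input kernel, a recurrent kernel, and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim: int, outputs: int = 1) -> int:
    """Weight matrix plus one bias per output."""
    return input_dim * outputs + outputs


def total_params(vocab_rows: int, embed_dim: int, units: int) -> int:
    """Embedding -> bidirectional LSTM (two LSTMs) -> Dense(1)."""
    bilstm = 2 * lstm_params(embed_dim, units)
    # The Dense layer sees the concatenated forward and backward states.
    head = dense_params(2 * units)
    return embedding_params(vocab_rows, embed_dim) + bilstm + head


total = total_params(VOCAB_ROWS, EMBED_DIM, LSTM_UNITS)
print(f"{total:,}")  # prints 2,823,425
```

Under these assumptions the float32 weights alone occupy about 11.3 MB (4 bytes per parameter). The ~32 MB figure would be consistent with a checkpoint that also stores two optimizer buffers per weight, as Adam does, but this card does not confirm that.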