International Journal On Advances in Systems and Measurements, volume 16, numbers 3 and 4, 2023
Fine-tuning BERT with Bidirectional LSTM for Fine-grained Movie Reviews Sentiment Analysis
Authors:
Gibson Nkhata
Usman Anjum
Justin Zhan
Susan Gauch
Keywords: Sentiment analysis, movie reviews, BERT, bidirectional LSTM, overall sentiment polarity
Abstract:
Sentiment Analysis (SA) is instrumental in understanding people’s viewpoints, facilitating social media monitoring, recognizing products and brands, and gauging customer satisfaction. Consequently, SA has evolved into an active research domain within Natural Language Processing (NLP). Many approaches in the literature devise intricate frameworks aimed at high accuracy, focusing exclusively on either binary or fine-grained sentiment classification. In this paper, our objective is to fine-tune the pre-trained BERT model with a Bidirectional LSTM (BiLSTM) to enhance both binary and fine-grained SA specifically for movie reviews. Our approach conducts sentiment classification for each review and then computes the overall sentiment polarity across all reviews. We present findings on both binary and fine-grained classification using benchmark datasets. Additionally, we implement and assess two accuracy improvement techniques, the Synthetic Minority Oversampling Technique (SMOTE) and NLP Augmenter (NLPAUG), to bolster the model’s generalization in fine-grained sentiment classification. Finally, a heuristic algorithm is employed to calculate the overall polarity of predicted reviews from the BERT+BiLSTM output vector. Our approach performs comparably with state-of-the-art (SOTA) techniques in both settings. In binary classification, we achieve 97.67% accuracy, surpassing the leading SOTA model, NB-weighted-BON+dv-cosine, by 0.27% on the renowned IMDb dataset. For five-class classification on SST-5, the top SOTA model, RoBERTa-large+Self-explaining, attains 55.5% accuracy, whereas our model achieves 59.48%, surpassing the BERT-large baseline by 3.6%.
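The BERT+BiLSTM architecture described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' exact configuration: the layer sizes, the use of the last BiLSTM time step for classification, and the class names are assumptions. In a real run, the input to this head would be the `last_hidden_state` tensor produced by a pre-trained BERT encoder (e.g., a Hugging Face `BertModel`); here a random tensor stands in for it.

```python
import torch
import torch.nn as nn

class BiLSTMHead(nn.Module):
    """Illustrative BiLSTM classification head placed on top of BERT's
    token-level hidden states (768-dim for BERT-base). Layer sizes are
    assumptions for the sketch, not the paper's reported settings."""

    def __init__(self, hidden_size=768, lstm_size=256, num_classes=5):
        super().__init__()
        # Bidirectional LSTM over the sequence of BERT token embeddings.
        self.bilstm = nn.LSTM(hidden_size, lstm_size,
                              batch_first=True, bidirectional=True)
        # Forward + backward directions are concatenated -> 2 * lstm_size.
        self.classifier = nn.Linear(2 * lstm_size, num_classes)

    def forward(self, bert_hidden_states):
        # bert_hidden_states: (batch, seq_len, hidden_size),
        # e.g. last_hidden_state from a pre-trained BertModel.
        lstm_out, _ = self.bilstm(bert_hidden_states)
        # Classify from the final time step of the BiLSTM output.
        logits = self.classifier(lstm_out[:, -1, :])
        return logits

head = BiLSTMHead()
dummy = torch.randn(2, 128, 768)  # stand-in for BERT output: 2 reviews, 128 tokens
print(head(dummy).shape)          # torch.Size([2, 5]) -- one logit per SST-5 class
```

Fine-tuning would backpropagate through both this head and the underlying BERT encoder; the five output logits correspond to the five SST-5 sentiment classes, and a two-logit variant would serve the binary IMDb task.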
Pages: 116–129
Copyright: Copyright (c) to authors, 2023. Used with permission.
Publication date: December 30, 2023
Published in: journal
ISSN: 1942-261X