eKNOW 2021, The Thirteenth International Conference on Information, Process, and Knowledge Management


Optimizing Statistical Distance Measures in Multivariate SVM for Sentiment Quantification

Authors:
Kevin Labille
Susan Gauch

Keywords: sentiment quantification, sentiment lexicon, multivariate SVM, statistical distances

Abstract:
Twitter sentiment classification has been widely investigated in recent years, and it is now possible to accurately determine the class label of a single tweet through various approaches. Twitter sentiment quantification, which aims to predict the prevalence of the positive and negative classes within a set of tweets, has drawn much less attention, although it could open new horizons for business or research. This paper presents our research on improving lexicon-based Twitter sentiment quantification. We first introduce a new approach to building a paired-score sentiment lexicon that is better suited to sentiment quantification. We then propose a novel feature vector representation for tweets that incorporates a collection of sentiment features. Finally, we investigate and compare several statistical distance kernels in a multivariate Support Vector Machine for sentiment quantification. Results suggest that optimizing the Hellinger distance with a multivariate SVM using our new sentiment lexicon outperforms current sentiment quantification approaches, including neural network approaches.
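To make the quantification task concrete, the following minimal sketch computes class prevalence over a set of labelled tweets and measures the Hellinger distance between a true and a predicted prevalence distribution. It is an illustration only, not the authors' implementation: the paper optimizes the distance inside a multivariate SVM, whereas this sketch merely evaluates it. The function names `prevalence` and `hellinger`, the class names, and the example labels are all assumptions made here.

```python
import math
from collections import Counter


def prevalence(labels, classes=("positive", "negative")):
    """Return the fraction of items belonging to each class, in order."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in classes]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    Ranges from 0 (identical) to 1 (disjoint support).
    """
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)


# Hypothetical labels for a set of ten tweets (illustrative data only).
true_labels = ["positive"] * 6 + ["negative"] * 4
pred_labels = ["positive"] * 5 + ["negative"] * 5

true_prev = prevalence(true_labels)
pred_prev = prevalence(pred_labels)
distance = hellinger(true_prev, pred_prev)
```

A quantifier is judged by how close `pred_prev` is to `true_prev`, not by how many individual tweets are labelled correctly, so a smaller `distance` indicates a better prevalence estimate.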

Pages: 57 to 64

Copyright: Copyright (c) IARIA, 2021

Publication date: July 18, 2021

Published in: conference

ISSN: 2308-4375

ISBN: 978-1-61208-874-7

Location: Nice, France

Dates: from July 18, 2021 to July 22, 2021