IMMM 2012, The Second International Conference on Advances in Information Mining and Management


Applying of Sentiment Analysis for Texts in Russian Based on Machine Learning Approach

Authors:
Nafissa Yussupova
Diana Bogdanova
Maxim Boyko

Keywords: text analysis; analysis of tonality; sentiment analysis; machine learning

Abstract:
This paper considers the problem of sentiment classification of text messages in Russian using machine learning methods: the Naive Bayes classifier and the Support Vector Machine. One feature of the Russian language is its wide variety of inflectional endings, which depend on declension, tense, and grammatical gender. Another problem, common to sentiment classification in different languages, is that different words can have the same meaning (synonyms) and thus carry equal emotional value. Our task was therefore to evaluate how lemmatization affects sentiment classification accuracy (in other words, classification with word endings versus without them) and to compare the results for Russian and English. To evaluate the impact of synonymy, we used an approach in which words with the same meaning are grouped into a single term. To solve these problems, we used lemmatization and synonym libraries. The results showed that lemmatization improves the accuracy of sentiment classification for texts in Russian. In contrast, sentiment classification of texts in English yields better results without lemmatization. The results also showed that accounting for synonymy in the model has a positive influence on accuracy. In the "Introduction", we describe the place of sentiment analysis in data mining. In "Approaches to the Sentiment Analysis", we discuss the main approaches to sentiment analysis: the linguistic approach, the approach based on machine learning, and their combination. In "Description of algorithms for Sentiment Analysis", we state the problem of sentiment classification and describe methods for solving it using a Naive Bayes classifier, Bagging, and the Support Vector Machine. In "Results of experiments", we describe the aims of the experiment and the features of the algorithm's implementation, and report the results. In the "Conclusion", we present the conclusions drawn from the results.
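
The preprocessing idea described in the abstract (reduce each word to its lemma, then group words with the same meaning into a single term before classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the `LEMMAS` and `SYNONYMS` dictionaries, the training sentences, and all function names are hypothetical stand-ins for real lemmatization and synonym libraries, English is used only for readability, and the classifier is a plain multinomial Naive Bayes with add-one smoothing.

```python
import math
from collections import Counter

# Hypothetical toy resources (assumptions for illustration only); a real
# system would use a lemmatizer and a synonym library for the target language.
LEMMAS = {"loved": "love", "loves": "love", "hated": "hate",
          "films": "film", "movies": "movie"}
SYNONYMS = {"movie": "film", "adore": "love", "awful": "bad", "terrible": "bad"}


def normalize(text):
    """Lowercase, split on whitespace, lemmatize, then merge synonyms."""
    tokens = text.lower().split()
    lemmas = [LEMMAS.get(tok, tok) for tok in tokens]
    return [SYNONYMS.get(lem, lem) for lem in lemmas]


def train(docs):
    """Collect per-class document and term counts from (text, label) pairs."""
    class_counts = Counter()
    term_counts = {}
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        counts = term_counts.setdefault(label, Counter())
        for term in normalize(text):
            counts[term] += 1
            vocab.add(term)
    return class_counts, term_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior under Naive Bayes."""
    class_counts, term_counts, vocab = model
    total_docs = sum(class_counts.values())
    terms = [t for t in normalize(text) if t in vocab]  # skip unseen terms
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)
        total_terms = sum(term_counts[label].values())
        for term in terms:
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((term_counts[label][term] + 1)
                              / (total_terms + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data, invented for this sketch.
TRAIN = [
    ("i loved this film", "pos"),
    ("great film", "pos"),
    ("i hated this film", "neg"),
    ("awful plot", "neg"),
]
model = train(TRAIN)
```

In this sketch, neither "loves" nor "terrible" occurs in the training data, yet `predict(model, "she loves movies")` and `predict(model, "terrible movie")` can still be scored, because normalization maps those words onto the training terms "love", "film", and "bad".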

Pages: 8 to 14

Copyright: Copyright (c) IARIA, 2012

Publication date: October 21, 2012

Published in: conference

ISSN: 2326-9332

ISBN: 978-1-61208-227-1

Location: Venice, Italy

Dates: from October 21, 2012 to October 26, 2012