Sentiment Classification Techniques Applied to Swedish Tweets: Investigating the Effects of Translation on Sentiments from Swedish into English

University essay from KTH/Skolan för datavetenskap och kommunikation (CSC)

Author: Mona Dadoun; Daniel Olsson; [2016]

Abstract: Sentiment classification is used for many purposes, such as business applications and opinion gathering. Because most text on the World Wide Web is written in English, available sentiment classifiers have mostly been trained on English datasets and rarely on datasets in other languages. This motivated an investigation of sentiment classification methods that can be applied to Swedish data. Therefore, this bachelor thesis examined to what extent the connotation of Swedish sentiments is retained when the text is translated into English. The research question was investigated by comparing the results obtained from applying sentiment classification techniques. In addition, the thesis investigated the outcome of combining a lexicon-based approach with a machine-learning-based approach by applying machine translation to Swedish Tweets. The source data was in Swedish and gathered from Twitter. A naive lexicon-based approach was used to score the polarity of the Tweets word by word, after which the sum of the polarities was calculated. The Swedish source data was then translated into English and run through a supervised machine-learning-based classifier, which scored it. In short, the investigation showed promising results: the translation did not affect the sentiments in a text, but other circumstances did. These other circumstances were mostly due to cross-lingual sentiment classification problems and the characteristics of supervised machine learning classifiers.
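
The naive lexicon-based scoring mentioned in the abstract (score each word, then sum the polarities) can be illustrated with a minimal sketch. This is not the thesis's implementation: the lexicon entries, their scores, and the names TOY_LEXICON, tokenize, score_tweet and label are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon: Swedish word -> polarity score.
# The words and values are illustrative assumptions, not taken from the thesis.
TOY_LEXICON = {
    "bra": 1,       # good
    "älskar": 2,    # love
    "glad": 1,      # happy
    "dålig": -1,    # bad
    "hatar": -2,    # hate
    "tråkig": -1,   # boring
}


def tokenize(text):
    """Lowercase the text and return its word tokens (Unicode-aware, keeps å/ä/ö)."""
    return re.findall(r"\w+", text.lower())


def score_tweet(text, lexicon=TOY_LEXICON):
    """Sum the polarity of every token; words missing from the lexicon count as 0."""
    return sum(lexicon.get(token, 0) for token in tokenize(text))


def label(score):
    """Map a summed polarity score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in [
        "Jag älskar den här filmen, den är bra!",
        "Vilken tråkig och dålig match",
        "Jag hatar måndagar men älskar kaffe",
    ]:
        s = score_tweet(tweet)
        print(f"{s:+d} {label(s)}: {tweet}")
```

A word-by-word sum like this ignores negation, sarcasm and word order, which is one reason such an approach is described as naive.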
