Machine Learning Based Sentiment Classification of Text, with Application to Equity Research Reports
Abstract: In this thesis, we analyse the sentiment in equity research reports written by analysts at Skandinaviska Enskilda Banken (SEB). We provide a description of established statistical and machine learning methods for classifying the sentiment in text documents as positive or negative. Specifically, a form of recurrent neural network known as long short-term memory (LSTM) is of interest. We investigate two different labelling regimes for generating training data from the reports. Benchmark classification accuracies are obtained using logistic regression models. Finally, two different word embedding models and bidirectional LSTMs of varying network size are implemented and compared to the benchmark results. We find that the logistic regression works well for one of the labelling approaches, and that the best LSTM models outperform it slightly.
AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)