Sentiment Analysis and Time-series Analysis for the COVID-19 vaccine Tweets

University essay from Blekinge Tekniska Högskola/Institutionen för datavetenskap

Abstract:
Background: The implicit nature of social media information brings many advantages to realistic sentiment analysis applications. Sentiment analysis is the process of extracting opinions and emotions from data. As a research topic, sentiment analysis of Twitter data has received much attention in recent years. In this study, we build a model that classifies the sentiments expressed in a dataset of public tweets, with the aim of raising awareness of the public's concerns.
Objectives: The main goal of this thesis is to develop a model that performs sentiment analysis on Twitter data regarding the COVID-19 vaccine and determines the polarity of each sentiment, so that the distribution of sentiments can be shown as positive, negative, and neutral. A literature study and an experiment are set up to identify a suitable approach for developing such a model. A time-series analysis is performed to obtain the daily sentiments over the timeline, together with a daily trend analysis that relates the sentiments to events on particular dates.
Method: A Systematic Literature Review is performed to identify the most suitable approach for sentiment analysis of tweets about the COVID-19 vaccine. Based on the results of the literature study, an experimental model is then developed to determine the distribution of sentiments in the analyzed data and to identify the daily sentiments over the timeline.
Result: The literature study identifies VADER as the most suitable approach for performing the sentiment analysis. A kernel density estimate (KDE) distribution is determined for each sentiment obtained from the VADER Sentiment Analyzer. Daily sentiments over the timeline are generated to support the trend analysis of the Twitter data on the COVID-19 vaccine.
Conclusion: This research set out to identify the best-suited approach for sentiment analysis of Twitter data for the selected dataset, based on the results of the study. The VADER model produces the best results in terms of sentiment polarity scores for the Twitter data in the selected dataset. The time-series analysis shows how the daily sentiments and the daily tweet counts fluctuate. The seasonal decomposition results indicate how the world is reacting to the current COVID-19 situation, and the daily trend analysis elaborates on people's everyday sentiments.
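
The labelling and daily aggregation described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the compound scores below are hypothetical values of the kind returned by the vaderSentiment package (SentimentIntensityAnalyzer.polarity_scores(text)["compound"]), the cut-offs of 0.05 and -0.05 are the conventional VADER thresholds and are only assumed here, and the names label_sentiment and daily_summary are made up for illustration.

```python
from collections import Counter, defaultdict
from datetime import date
from statistics import mean

# Conventional VADER cut-offs on the compound score. Assumption: the
# abstract does not state which thresholds the thesis actually used.
POS_THRESHOLD = 0.05
NEG_THRESHOLD = -0.05


def label_sentiment(compound: float) -> str:
    """Map a compound score in [-1, 1] to positive, negative, or neutral."""
    if compound >= POS_THRESHOLD:
        return "positive"
    if compound <= NEG_THRESHOLD:
        return "negative"
    return "neutral"


def daily_summary(scored_tweets):
    """Group (day, compound) pairs by day.

    Returns a dict ordered by day, holding the mean compound score and
    the number of tweets per sentiment label for each day.
    """
    by_day = defaultdict(list)
    for day, compound in scored_tweets:
        by_day[day].append(compound)

    summary = {}
    for day in sorted(by_day):
        scores = by_day[day]
        summary[day] = {
            "mean_compound": mean(scores),
            "counts": Counter(label_sentiment(s) for s in scores),
        }
    return summary


# Hypothetical (date, compound score) pairs standing in for scored tweets.
sample = [
    (date(2021, 1, 1), 0.62),
    (date(2021, 1, 1), -0.40),
    (date(2021, 1, 1), 0.00),
    (date(2021, 1, 2), 0.30),
    (date(2021, 1, 2), 0.10),
]

for day, stats in daily_summary(sample).items():
    print(day, round(stats["mean_compound"], 3), dict(stats["counts"]))
```

The per-day mean and label counts produced this way form the kind of daily series on which a trend analysis or a seasonal decomposition could then be run.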
