Detecting changes in word associations over short time periods: Analysing Twitter data with Word2Vec over time
Abstract: The meanings and connotations of words are constantly changing. Traditionally, such changes are tracked over relatively long time periods by analysing variations in word usage in written records such as books and newspapers, and by comparing dictionary entries for the words of interest. For short time periods, however, these methods are unsuitable because too little such language data is available. In this thesis, we explore a method to detect changes in word associations over short time periods by analysing word usage on Twitter. Roughly 3 million tweets were gathered over a one-month period and then analysed using Word2Vec word embeddings, a technique from the field of Natural Language Processing (NLP). Words associated with the ongoing coronavirus pandemic were chosen for analysis, on the assumption that their associations would change during the measurement period. Chiefly, we analysed how the associations between each of the words 'corona' and 'bleach' and its most highly associated words changed over the measurement period. The results show a stark shift in connotation for the word 'bleach' at the time of Donald Trump's statement, made on April 23rd, 2020, regarding ingestion of disinfectant as a possible way to combat the virus. They also show fluctuations in the associations of 'corona', for example a peak in similarity to the word '5g' as conspiracy theories about its possible connection to the pandemic spread in news media outlets. Our conclusion is that the method could be suitable for detecting short-term changes in word associations, but that more data would need to be analysed to make the results more reliable.
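As a rough illustration of the kind of comparison described above, the sketch below tracks the cosine similarity between two words across consecutive time periods and reports where the largest jump occurs. It is not the thesis's implementation: it assumes that one set of word vectors per period has already been trained (for example with Word2Vec), and the function names, period labels and the two-dimensional vectors are hypothetical, invented purely for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_over_time(embeddings_by_period, word_a, word_b):
    """Return [(period, similarity)] for every period in which both words have a vector."""
    series = []
    for period in sorted(embeddings_by_period):
        vectors = embeddings_by_period[period]
        if word_a in vectors and word_b in vectors:
            sim = cosine_similarity(vectors[word_a], vectors[word_b])
            series.append((period, sim))
    return series


def largest_shift(series):
    """Return (period, delta) for the largest absolute change between consecutive periods."""
    best = None
    for (_, previous), (period, current) in zip(series, series[1:]):
        delta = current - previous
        if best is None or abs(delta) > abs(best[1]):
            best = (period, delta)
    return best


# Hypothetical toy vectors, NOT results from the thesis. Real Word2Vec
# vectors would typically have a hundred or more dimensions.
toy_embeddings = {
    "period_1": {"corona": [1.0, 0.0], "5g": [0.0, 1.0]},
    "period_2": {"corona": [1.0, 0.1], "5g": [0.1, 1.0]},
    "period_3": {"corona": [1.0, 1.0], "5g": [1.0, 1.0]},
}

series = similarity_over_time(toy_embeddings, "corona", "5g")
shift = largest_shift(series)
for period, sim in series:
    print(period, round(sim, 3))
print("largest shift at", shift[0])
```

One caveat on this design: vectors from separately trained models live in different coordinate systems, so the sketch only compares similarities between two words within the same period, never raw vectors across periods.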