Using Artificial Intelligence to Verify Authorship of Anonymous Social Media Posts

University essay from Mälardalens högskola/Akademin för innovation, design och teknik

Author: Filip Lagerholm; [2017]

Keywords: ;

Abstract: The widespread use of social media, along with the possibilities to conceal one’s identity in the fibrillation of ubiquitous technology, combined with crime and terrorism becoming digitized, has increased the need of possibilities to find out who hides behind an anonymous alias. This report deals with authorship verification of posts written on Twitter, with the purpose of investigating whether it is possible to develop an auxiliary tool that can be used in crime investigation activities. The main research question in this report is whether a set of tweets written by an anonymous user can be matched to another set of tweets written by a known user, and, based on their linguistic styles, if it is possible to calculate a probability of whether the authors are the same. The report also examines the question of how linguistic styles can be extracted for use in an artificially intelligent classification, and how much data is needed to get adequate results. The subject matter is interesting as the work described in this report concerns a potential future scenario where digital crimes are difficult to investigate with traditional network-based tracking techniques. The approach to the problem is to evaluate traditional methods of feature extraction in natural language processing, and by classifying the features using a type of recurrent neural network called Long Short-Term Memory. While the best result in an experiment that was carried out achieved an accuracy of 93.32%, the overall results showed that the choice of representation, and amount of data used, is crucial. This thesis complements the existing knowledge as very short texts, in the form of social media posts, are in focus.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)