When words are not enough: An evaluation of character n-grams and function words in author identification of musical artists

University essay from Umeå universitet/Institutionen för datavetenskap

Author: Alexander Nyström; [2018]


Abstract: When we write texts we unconsciously leave prints behind, such as the words we use, punctuation, special characters and more. Several approaches to author identification utilise these features, and they have been applied to a variety of texts, from papers and poems to e-mail and forum posts. This study uses song lyrics, with the artists treated as the authors, and compares the performance of two common features on them: character n-grams and function words. These are among the most prominent features within author identification, and both have a track record of good performance. Despite high expectations, neither feature reached the expected results. Character n-grams and function words were expected to achieve 70% and 65% accuracy respectively; the achieved average accuracy was only 40% and 35%. Even with the poor results, some interesting findings were made. Some artists have multiple band members writing the songs, which raised the concern that this would hurt performance. Interestingly, the results showed that multiple authors did not have a negative effect on performance; in some cases these artists were identified more accurately than artists with a single author.
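
To make the two features concrete, here is a minimal Python sketch of how each could be extracted from a lyric. It is an illustration only and not the essay's implementation: the function names, the choice of n = 3, the simple punctuation stripping, and the ten-word FUNCTION_WORDS list are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for this
# sketch; the essay does not specify which list was used).
FUNCTION_WORDS = ("the", "and", "of", "to", "in", "a", "i", "you", "it", "that")


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in the lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def function_word_profile(text, vocabulary=FUNCTION_WORDS):
    """Return the relative frequency of each function word in the text."""
    # Split on whitespace and strip common punctuation from token edges.
    tokens = [t.strip(".,!?;:\"'()") for t in text.lower().split()]
    tokens = [t for t in tokens if t]
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in vocabulary]


if __name__ == "__main__":
    lyric = "I walk the line and you walk the line"
    print(char_ngrams(lyric).most_common(3))
    print(function_word_profile(lyric))
```

Either output could then be turned into a feature vector and passed to a classifier that predicts the artist. Which classifier and which parameters the essay used is not stated in the abstract.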
