Tracking the Time Trends of Swedish Literature and Finding Characteristics of Authors by Using Topic Modelling

University essay from Uppsala universitet/Institutionen för lingvistik och filologi

Abstract: In this thesis, we examine time trends in Swedish literature and the characteristics of its authors. We apply latent Dirichlet allocation (LDA), a topic modelling method, to a corpus of 118 Swedish books and prose works collected from Litteraturbanken. The LDA model yields two findings: topics that focus on daily life, such as nature or family, occur frequently in the corpus; and peaks in the time trend of a topic arise either from several authors writing books on the same topic or from a single author writing several books within a short period. LDA is also applicable to assessing the characteristics of authors: for the nine authors with more than three books in the corpus, we list their particular topics by comparing each author's topic distribution with the topic distribution of the entire corpus.
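
The comparison of author and corpus topic distributions described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the thesis's actual procedure: the function names, the ratio-based criterion, the `min_ratio` threshold, and the toy document-topic values are all assumptions made for this example. It assumes an LDA model has already produced one topic distribution per book.

```python
from collections import defaultdict


def mean_distribution(rows):
    """Average a list of equal-length topic distributions, topic by topic."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def characteristic_topics(doc_topics, authors, min_ratio=2.0):
    """For each author, list topic indices whose mean weight in that
    author's books is at least `min_ratio` times the corpus-wide mean.

    The ratio criterion is an assumption for illustration only.
    """
    corpus_mean = mean_distribution(doc_topics)

    # Group the per-book topic distributions by author.
    by_author = defaultdict(list)
    for row, author in zip(doc_topics, authors):
        by_author[author].append(row)

    result = {}
    for author, rows in by_author.items():
        author_mean = mean_distribution(rows)
        result[author] = [
            k
            for k, (a, c) in enumerate(zip(author_mean, corpus_mean))
            if c > 0 and a / c >= min_ratio
        ]
    return result


# Toy document-topic matrix: made-up numbers, not results from the thesis.
doc_topics = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.2, 0.6],
    [0.1, 0.8, 0.1],
    [0.1, 0.7, 0.2],
]
authors = ["A", "A", "B", "B", "C", "C"]

print(characteristic_topics(doc_topics, authors, min_ratio=1.5))
```

In practice the per-book distributions would come from a fitted LDA model rather than being written by hand, and a different comparison (for example a difference or a divergence measure) may suit the data better than a simple ratio.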
