Document Expansion for Swedish Information Retrieval Systems

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Information retrieval systems have changed how users interact with computerized systems and locate information. A major challenge when designing these systems is the vocabulary mismatch problem, i.e. that users, when formulating queries, pick different words than those present in the relevant documents that should be retrieved. With recent advances in artificial intelligence and the emergence of transformer-based language models, new methods have been proposed to alleviate this problem. One such method is the use of document expansion models, which append to each document words that are likely to be part of users’ queries. As previous research on document expansion models has focused on English-language applications, this thesis investigates the effectiveness of one such model for Swedish applications. Although no improvement was found when using this method, this result is likely a consequence of dataset quality and domain rather than of the method itself.
