Re-ranking search results with KB-BERT

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: This master thesis aims to determine if a Swedish BERT model can improve a BM25 search by re-ranking the top search results. We compared a standard BM25 search algorithm with a more complex algorithm composed of a BM25 search followed by re-ranking the top 10 results by a BERT model. The BERT model used is KB-BERT, a publicly available neural network model built by the National Library of Sweden. We fine-tuned this model to solve the specific task of evaluating the relevancy of search results. A new Swedish search evaluation dataset was automatically generated from Wikipedia text to compare the algorithms. The search evaluation dataset is a standalone product and can be beneficial for evaluating other search algorithms on Swedish text in the future. The comparison of the two algorithms resulted in a slightly better ranking for the BERT re-ranking algorithm. These results align with similar studies using an English BERT and an English search evaluation dataset. 

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)