Cognitive Search Engine Optimization

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Joakim Edlund; [2020]


Abstract: Search engines are a common way to navigate information today. Information retrieval is the field concerned with finding documents in large unstructured collections. Within this field there are widely researched baseline solutions to this problem, as well as more advanced techniques (often based on machine learning) that aim to improve the relevance of results further. However, picking the right algorithm or technique when implementing a search engine is no trivial task, and deciding which one performs better can be hard. This project takes a commonly used baseline search engine implementation (Elasticsearch) and measures the relevance of its results using standard measures from the field of information retrieval (precision, recall, F-measure). After a baseline configuration is established, a query expansion algorithm (based on Word2Vec) and a recommendation algorithm (collaborative filtering) are implemented in parallel and compared against each other and against the baseline configuration. Finally, a combined model using both the query expansion algorithm and collaborative filtering is evaluated to see whether the two can utilize each other’s strengths to produce an even better setup. Findings show that both Word2Vec and collaborative filtering improve relevance across all three measures (precision, recall, F-measure). Statistical analysis confirmed that these improvements are significant. Collaborative filtering seems to perform better than Word2Vec for the topmost results, while Word2Vec improves more as the result set grows longer. The combined model showed a significant improvement in all measures for result sets of sizes 3 and 5, while larger result sets showed less improvement and in some cases even worse performance.
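
As an illustration of the three measures named in the abstract, the following is a minimal sketch of how precision, recall, and F-measure might be computed for the top k results of a single query. It is not taken from the essay: the function name `precision_recall_f1_at_k`, the document IDs, and the choice to divide precision by the number of results actually returned are all assumptions made for this example.

```python
from typing import Sequence, Set, Tuple


def precision_recall_f1_at_k(
    ranked: Sequence[str], relevant: Set[str], k: int
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for the first k results of one query.

    Assumes `ranked` holds unique document IDs in rank order and `relevant`
    holds the IDs judged relevant for the query (hypothetical inputs).
    """
    top_k = ranked[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)

    # Precision: fraction of the returned results that are relevant.
    precision = hits / len(top_k) if top_k else 0.0
    # Recall: fraction of all relevant documents that were returned.
    recall = hits / len(relevant) if relevant else 0.0
    # F1: harmonic mean of precision and recall (0 when both are 0).
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Made-up ranking and relevance judgments, for illustration only.
ranked_results = ["d3", "d7", "d1", "d9", "d4"]
relevant_docs = {"d1", "d3", "d5"}

for k in (3, 5):
    p, r, f = precision_recall_f1_at_k(ranked_results, relevant_docs, k)
    print(f"k={k}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Evaluating at several values of k, as in the loop above, mirrors the abstract's comparison across result sets of different sizes: with a longer result set recall can only stay the same or rise, while precision may fall.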
