Recommender system for IT security scanning service: Collaborative filtering in an error report scenario

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Recommender systems have become an integral part of the user interface of many web applications. Recommending items to buy, media to view, or similar “next choice” recommendations has proven to be a powerful tool for improving customer experience and engagement. One common technique for producing recommendations, Collaborative Filtering, makes use of the unsupervised Nearest Neighbor algorithm: a customer's historic use of a service is encoded as a vector, and recommendations are made such that, if followed, the resulting behaviour vector would lie closer to the nearest neighboring vectors encoding other customers. This thesis describes the adaptation of a Collaborative Filtering recommender system to a cyber security vulnerability report setting, with the goal of producing recommendations regarding which of a set of found vulnerabilities to prioritize for mitigation. Such an error report scenario presents idiosyncrasies that do not allow a direct application of common recommender system algorithms. This work was carried out in collaboration with the company Detectify, whose product allows users to check for vulnerabilities in their internet-facing software, typically web pages and apps. The priorities that historic customers gave to mitigating their findings have to be inferred from differences in their consecutive reports, i.e. from noisy vector-valued signals. Further, as opposed to the typical e-commerce or media streaming scenario, a user cannot freely choose which item to increase their consumption of; instead, a user can only attempt to decrease their inventory of a limited subset (the vulnerabilities in their report) of all items (all possible vulnerabilities). This thesis presents an adapted Collaborative Filtering algorithm applicable to this scenario. The chosen approach to the algorithm is motivated by an extensive literature review of the current state of the art of recommender systems.
To measure the performance of the algorithm, test data is produced that allows for comparison between recommendations based on noisy data and the actual change in a noiseless version. The results give reference values for the levels of noise and data sparsity under which the developed algorithm can be expected to produce recommendations that align well with the historic behavioural patterns of other customers. This thesis thus provides a novel variation of the Collaborative Filtering algorithm that extends its usability to a scenario that has not previously been addressed in the reviewed literature.
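
The general idea described in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm developed in the thesis, only one plausible reading of the setup under stated assumptions: each report is a vector of counts per vulnerability type, a customer's mitigation priorities are approximated by the positive decrease between two consecutive reports, neighbors are found by cosine similarity, and only vulnerabilities present in the target report can be recommended. All names (`cosine`, `inferred_mitigation`, `recommend`) and the toy data are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def inferred_mitigation(before, after):
    """Assumed proxy for priorities: how much each count decreased."""
    return [max(b - a, 0) for b, a in zip(before, after)]


def recommend(target_report, history, k=2):
    """Rank the vulnerability types in target_report for mitigation.

    history is a list of (before, after) report pairs of other customers.
    The k customers whose 'before' report is most similar to the target
    vote for what to fix, weighted by their similarity.
    """
    neighbors = sorted(
        ((cosine(target_report, before), inferred_mitigation(before, after))
         for before, after in history),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]

    scores = [0.0] * len(target_report)
    for similarity, fixed in neighbors:
        for i, amount in enumerate(fixed):
            scores[i] += similarity * amount

    # Only vulnerabilities actually present in the report are candidates.
    candidates = [i for i, count in enumerate(target_report) if count > 0]
    return sorted(candidates, key=lambda i: scores[i], reverse=True)


# Toy data: four vulnerability types, three historic customers.
history = [
    ([3, 1, 0, 2], [1, 1, 0, 2]),  # reduced type 0
    ([2, 0, 1, 2], [0, 0, 1, 1]),  # reduced types 0 and 3
    ([0, 4, 4, 0], [0, 0, 4, 0]),  # reduced type 1
]
target = [2, 1, 0, 3]

ranking = recommend(target, history, k=2)
print(ranking)  # [0, 3, 1]
```

In this toy case the two most similar historic customers both reduced type 0, so it is ranked first; type 2 is never recommended because it does not appear in the target report. The sketch ignores noise in the reports, which the thesis treats as a central difficulty.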
