Comparing performance of K-Means and DBSCAN on customer support queries

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: In customer support, there are often a lot of repeat questions, and questions that does not need novel answers. In a quest to increase the productivity in the question answering task within any business, there is an apparent room for automatic answering to take on some of the workload of customer support functions. We look at clustering corpora of older queries and texts as a method for identifying groups of semantically similar questions and texts that would allow a system to identify new queries that fit a specific cluster to receive a connected, automatic response. The approach compares the performance of K-means and density-based clustering algorithms on three different corpora using document embeddings encoded with BERT. We also discuss the digital transformation process, why companies are unsuccessful in their implementation as well as the possible room for a new more iterative model.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)