Evaluating Statistical MachineLearning and Deep Learning Algorithms for Anomaly Detection in Chat Messages
Abstract: Automatically detecting anomalies in text is of great interest for surveillance entities as vast amounts of data can be analysed to find suspicious activity. In this thesis, three distinct machine learning algorithms are evaluated as a chat message classifier is being implemented for the purpose of market surveillance. Naive Bayes and Support Vector Machine belong to the statistical class of machine learning algorithms being evaluated in this thesis and both require feature selection, a side objective of the thesis is thus to find a suitable feature selection technique to ensure mentioned algorithms achieve high performance. Long Short-Term Memory network is the deep learning algorithm being evaluated in the thesis, rather than depend on feature selection, the deep neural network will be evaluated as it is trained using word embeddings. Each of the algorithms achieved high performance but the findings ofthe thesis suggest Naive Bayes algorithm in conjunction with a feature counting feature selection technique is the most suitable choice for this particular learning problem.
AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)