Anomaly Detection in Log Files Using Machine Learning

University essay from Luleå tekniska universitet/Institutionen för system- och rymdteknik

Abstract: Logs generated by the applications, devices, and servers contain information that can be used to determine the health of the system. Manual inspection of logs is important, for example during upgrades, to determine whether the upgrade and data migration were successful. However, manual testing is not reliable enough, and manual inspection of logs is tedious and time-­consuming. In this thesis, we propose to use the machine learning techniques K­means and DBSCAN to find anomaly sequences in log files. This research also investigated two different kinds of data representation techniques, feature vector representation, and IDF representation. Evaluation metrics such as F1 score, recall, and precision were used to analyze the performance of the applied machine learning algorithms. The study found that the algorithms have large differences regarding detection of anomalies, in which the algorithms performed better in finding the different kinds of anomalous sequences, rather than finding the total amount of them. The result of the study could help the user to find anomalous sequences, without manually inspecting the log file.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)