Anomaly Detection in Unstructured Time Series Data using an LSTM Autoencoder

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Maxim Wolpher; [2018]

Abstract: An exploration of anomaly detection. Much work has been done on the topic of anomaly detection, but what seems to be lacking is a deeper look at anomaly detection in unstructured and unlabeled data. This thesis aims to determine the effectiveness of combining recurrent neural networks with autoencoder structures for sequential anomaly detection. The use of an LSTM autoencoder will be detailed, but along the way there will also be background on time-independent anomaly detection using Isolation Forests and Replicator Neural Networks on the benchmark DARPA dataset. The empirical results in this thesis show that Isolation Forests and Replicator Neural Networks both reach an F1-score of 0.98. The Replicator Neural Network reached a ROC AUC score of 0.90, while the Isolation Forest reached a ROC AUC of 0.99. The results for the LSTM Autoencoder show that, with 137 features extracted from the unstructured data, it can reach an F1-score of 0.8 and a ROC AUC score of 0.86.
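
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how an F1-score and a ROC AUC can be computed from per-sample anomaly scores, such as the reconstruction errors of an autoencoder. The labels, scores, and threshold are made up for illustration and are not taken from the thesis; the function names are hypothetical as well.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the anomaly class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random anomaly scores higher than a random normal
    sample (ties count as half). This equals the area under the ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 marks an anomaly, 0 a normal sample.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
# Hypothetical anomaly scores, e.g. reconstruction errors.
errors = [0.10, 0.20, 0.15, 0.60, 0.55, 0.05, 0.90, 0.80]

# Assumed threshold: samples scoring above it are flagged as anomalies.
threshold = 0.5
predictions = [1 if e > threshold else 0 for e in errors]

print("F1:", f1_score(labels, predictions))
print("ROC AUC:", roc_auc(labels, errors))
```

Note that the F1-score depends on the chosen threshold, whereas the ROC AUC is computed from the raw scores and summarizes performance across all thresholds. This is one reason the two numbers can differ noticeably for the same model.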

