Cooperative security log analysis using machine learning: Analyzing different approaches to log featurization and classification
Abstract: This thesis evaluates the performance of different machine learning approaches to log classification, using a dataset derived from simulating intrusive behavior against an enterprise web application. The first experiment consists of attacking the web application and correlating the attacks with the resulting logs to create a labeled dataset. The second experiment evaluates one unsupervised model based on a variational autoencoder and four supervised models. The supervised models combine either conventional feature-engineering techniques with deep neural networks, or embedding-based featurization with long short-term memory (LSTM) architectures and convolutional neural networks. On this dataset, the embedding-based approaches performed much better than the conventional approach, and the autoencoder did not perform well compared to the supervised models. In conclusion, embedding-based approaches show promise even on datasets whose characteristics differ from those of natural language.