Near Real-time Detection of Masquerade attacks in Web applications : catching imposters using their browsing behavior

University essay from KTH/Kommunikationsnät

Author: Vasileios Panopoulos; [2016]

Keywords: Naive Bayes; SVM; Support Vector Machines; Machine Learning; IDS; Intrusion Detection System; Web Application; scikit-learn;

Abstract: This thesis details research on Machine Learning techniques that are central to Anomaly and Masquerade attack detection. The main focus is on Web Applications because of their immense popularity and ubiquity. This popularity has led to an increase in attacks, making them the most targeted entry point for violating a system. Specifically, a group of attacks ranging from identity theft through social engineering to cross-site scripting aims at exploiting and masquerading as users. Masquerade attacks are even harder to detect because of their resemblance to normal sessions, which poses an additional burden. Concerning prevention, the diversity and complexity of those systems make it harder to define reliable protection mechanisms. Additionally, new and emerging attack patterns make manually configured and signature-based systems less effective, since they must be continuously updated with new rules and signatures. As a result, they eventually become obsolete if left unmanaged. Finally, the huge amount of traffic makes manual inspection of attacks and false alarms an impossible task. To tackle those issues, Anomaly Detection systems built on powerful and proven Machine Learning algorithms are proposed. Within the context of Anomaly Detection and Machine Learning, this thesis first defines several basic concepts such as user behavior, normality, and normal and anomalous behavior. These definitions set the context that the proposed method targets and establish its theoretical premises. To ease the transition into the implementation phase, the underlying methodology is also explained in detail. The implementation is then presented: starting from server logs, a method is described for pre-processing the data into a form suitable for classification.
This pre-processing phase combines several statistical analyses and normalization methods (Univariate Selection, ANOVA) to clean and transform the given logs and to perform feature selection. Furthermore, given that the proposed detection method is based on the source and request URLs, a method of aggregation is proposed to limit user privacy and classifier over-fitting issues. Subsequently, two popular classification algorithms (Multinomial Naive Bayes and Support Vector Machines) were tested and compared to determine which one performs better in the given setting. Each of the implementation steps (pre-processing and classification) requires a number of different parameters to be set, and thus a method called Hyper-parameter optimization is defined. This method searches for the parameters that improve the classification results. Moreover, the training and testing methodology is outlined alongside the experimental setup. The Hyper-parameter optimization and the training phases are the most computationally intensive steps, especially given a large number of samples/users. To overcome this obstacle, a scaling methodology is also defined and evaluated to demonstrate its ability to handle larger data sets. To complete this framework, several other options were evaluated and compared to each other to challenge the method and implementation decisions. Examples are the "Transitions-vs-Pages" dilemma, the block restriction effect, the usefulness of DR, and the optimization of the classification parameters. Moreover, a Survivability Analysis is performed to demonstrate how the produced alarms could be correlated, affecting the resulting detection rates and interval times. The implementation of the proposed detection method and the outlined experimental setup lead to interesting results. In addition, the data set used to produce this evaluation is provided online to promote further investigation and research in this field.
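
The URL aggregation and the "Transitions-vs-Pages" choice mentioned above can be illustrated with a minimal sketch. The abstract does not state the aggregation rule, so the rule used here (dropping the query string and replacing numeric path segments with a placeholder) is only an assumption, and the names `aggregate_url`, `page_counts`, and `transition_counts` are hypothetical. The sketch shows how one session of (source URL, request URL) pairs could be turned into either per-page counts or per-transition counts.

```python
from collections import Counter
from urllib.parse import urlsplit


def aggregate_url(url: str) -> str:
    """Collapse a URL into a coarser page label (hypothetical rule).

    Assumption: the query string is dropped and purely numeric path
    segments become "<id>", so user-specific identifiers neither leak
    into the features nor encourage over-fitting.
    """
    path = urlsplit(url).path
    segments = [
        "<id>" if seg.isdigit() else seg
        for seg in path.split("/")
        if seg
    ]
    return "/" + "/".join(segments)


def page_counts(session):
    """'Pages' view: how often each aggregated page was requested."""
    return Counter(aggregate_url(request) for _source, request in session)


def transition_counts(session):
    """'Transitions' view: how often each (source, request) pair occurred."""
    return Counter(
        (aggregate_url(source), aggregate_url(request))
        for source, request in session
    )


# A session is a list of (source URL, request URL) pairs taken from a log.
# The URLs below are made-up illustrations, not data from the thesis.
session = [
    ("https://example.org/", "https://example.org/users/42?tab=info"),
    ("https://example.org/users/42?tab=info", "https://example.org/users/42/edit"),
    ("https://example.org/users/42/edit", "https://example.org/users/7"),
]

pages = page_counts(session)
transitions = transition_counts(session)
```

Count vectors of this kind are the sort of non-negative features that a Multinomial Naive Bayes classifier accepts. Whether pages or transitions work better is exactly the question the thesis evaluates, so this sketch takes no position on it.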
