Detecting PowerShell Obfuscation Techniques using Natural Language Processing

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: PowerShell obfuscation is often used to evade detection by antivirus programs. Several different techniques can alter a PowerShell script while leaving its behavior unchanged. Detecting obfuscated files is therefore a useful complement to other ways of detecting malicious files. Identifying the specific technique used can also help an analyst tasked with investigating the detected files. To detect these techniques, we use Natural Language Processing, based on the idea that each technique resembles a distinct language that can be recognized. We tried several models and iterations of data processing, settled on a Random Forest Classifier, and achieved a detection accuracy of 98%.
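
The idea that each obfuscation technique behaves like its own language can be illustrated with a small sketch. It is not the pipeline from the essay: the abstract does not describe the features used, and the Random Forest Classifier is replaced here by a much simpler nearest-profile comparison over character bigrams, so that the example runs with the Python standard library alone. The sample scripts, the label names (plain, concat, backtick) and all function names are hypothetical and invented for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Count overlapping character n-grams in a lower-cased script."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy training data: the same three commands written plainly,
# with string concatenation, and with backtick escapes.
SAMPLES = {
    "plain": [
        "Write-Host hello",
        "Get-Process | Stop-Process",
        "Invoke-Expression $cmd",
    ],
    "concat": [
        "('Wri'+'te-'+'Host') hello",
        "('Get'+'-Pro'+'cess')",
        "('Inv'+'oke-Ex'+'pression')",
    ],
    "backtick": [
        "W`r`i`te-H`o`st hello",
        "G`e`t-P`r`o`cess",
        "I`n`v`oke-E`x`pression",
    ],
}


def build_profiles(samples, n=2):
    """Sum the n-gram counts of all scripts that share a label."""
    profiles = {}
    for label, scripts in samples.items():
        total = Counter()
        for script in scripts:
            total.update(char_ngrams(script, n))
        profiles[label] = total
    return profiles


def classify(script, profiles, n=2):
    """Return the label whose n-gram profile is most similar to the script."""
    grams = char_ngrams(script, n)
    return max(profiles, key=lambda label: cosine(grams, profiles[label]))


if __name__ == "__main__":
    profiles = build_profiles(SAMPLES)
    for script in ["S`t`a`rt-S`l`e`ep 5", "('Sta'+'rt-Sl'+'eep') 5"]:
        print(script, "->", classify(script, profiles))
```

Each technique leaves characteristic character sequences behind (a backtick between letters, or a quote followed by a plus sign), so even this crude comparison separates the toy examples. A real detector would need a large labeled corpus and a trained classifier such as the Random Forest named in the abstract; nothing here says anything about the reported 98% accuracy.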
