Increased evasion resilience in modern PDF malware detectors: Using a more evasive training dataset

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: The large-scale usage of PDF, coupled with its versatility, has made the format an attractive target for carrying and deploying malware. Traditional antivirus software struggles against new malware and PDF's vast obfuscation options. In the search for better detection systems, machine-learning-based detectors have been developed. Their approaches vary (some strictly examine structural features of the document, whereas others examine the behavior of embedded code), but they generally achieve high accuracy on the data they have been evaluated against. However, structural machine-learning-based PDF malware detectors have been found to be weak against targeted evasion attempts that may be found in more sophisticated malware. Such evasion attempts typically exploit knowledge of what the detection system associates with 'benign' and 'malicious' in order to emulate benign features, or exploit a bug in the implementation, with the purpose of evading the detector. Since such evasion attacks were introduced, further structural detectors have been developed without mitigations against them. This thesis aggregates the existing knowledge of evasion strategies, applies them against a reproduction of a recent detection system that has not previously been evasion-tested, and finds that it is susceptible to various evasion techniques. Additionally, the reproduced detector is experimentally trained on a combination of the standard data and the recently published CIC-Evasive-PDFMal2022 dataset, which contains malware samples that display evasive properties. The evasive-trained detector is tested against the same set of evasion attacks. The results of the two detectors are compared, leading to the conclusion that supplementing the training data with evasive samples yields a more evasion-resilient detector.
