
Detecting Obfuscated Scripts With Machine Learning Techniques



dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.advisor Aura, Tuomas
dc.contributor.advisor Kraemer, Frank Alexander
dc.contributor.author Pogosova, Mariam
dc.date.accessioned 2020-03-22T18:07:40Z
dc.date.available 2020-03-22T18:07:40Z
dc.date.issued 2020-03-19
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/43575
dc.description.abstract Complex operating system administration tasks can be automated and simplified with scripting languages. On Windows, one of the most commonly used scripting languages is PowerShell. PowerShell gives system administrators extensive functionality, but it also exposes a large attack surface that adversaries can use to bypass OS protections. Signature-based and supervised machine-learning-based intrusion detection systems (IDS) can be used to monitor for and detect such malicious scripts. However, detection can be evaded by obfuscating the scripts. As the next step in the defense, obfuscation itself can be used as a reliable sign of malicious code. This thesis investigates methods for detecting obfuscated PowerShell scripts with machine learning (ML) techniques. We trained logistic regression, random forest, and gradient boosting models on a balanced dataset. To generate the dataset, unobfuscated scripts were taken from open-source projects and obfuscated with open-source obfuscators. We then selected the most important independent features for obfuscation detection. The ML methods were compared using their ROC curves and AUC values. The best method turned out to be the gradient boosting model, which achieves an AUC close to one on the dataset used. Moreover, the model can classify a script in less than one millisecond. Thus, the model can replace existing approaches to obfuscation detection, and antivirus vendors can use it in the process of detecting malicious PowerShell scripts. en
dc.format.extent 50+5
dc.format.mimetype application/pdf en
dc.language.iso en en
dc.title Detecting Obfuscated Scripts With Machine Learning Techniques en
dc.type G2 Pro gradu, diplomityö fi
dc.contributor.school Perustieteiden korkeakoulu fi
dc.subject.keyword machine learning en
dc.subject.keyword PowerShell en
dc.subject.keyword malware en
dc.subject.keyword scripting language en
dc.subject.keyword obfuscation detection en
dc.identifier.urn URN:NBN:fi:aalto-202003222608
dc.programme.major Security and Mobile Computing fi
dc.programme.mcode T3011 fi
dc.type.ontasot Master's thesis en
dc.type.ontasot Diplomityö fi
dc.contributor.supervisor Aura, Tuomas
dc.programme Master's Degree Programme in Security and Mobile Computing (NordSecMob) fi
local.aalto.electroniconly yes
local.aalto.openaccess yes

