Skip to Main Content
Malware landscape has been dramatically elevated over the last decade. The main reason of the increase is that new malware variants can be produced easily using simple code obfuscation techniques. Once the obfuscation is applied, the malware can change their syntactics while preserving semantics, and bypass anti-virus (AV) scanners. Malware authors, thus, commonly use the code obfuscation techniques to generate metamorphic malware. Nevertheless, signature based AV techniques are limited to detect the metamorphic malware since they are commonly based on the syntactic signature matching. In this paper, we propose BinGraph, a new mechanism that accurately discovers metamorphic malware. BinGraph leverages the semantics of malware, since the mutant malware is able to manipulate their syntax only. To this end, we first extract API calls from malware and convert to a hierarchical behavior graph that represents with identical 128 nodes based on the semantics. Later, we extract unique subgraphs from the hierarchical behavior graphs as semantic signatures representing common behaviors of a specific malware family. To evaluate BinGraph, we analyzed a total of 827 malware samples that consist of 10 malware families with 1,202 benign binaries. Among the malware, 20% samples randomly chosen from each malware family were used for extracting semantic signatures, and rest of them were used for assessing detection accuracy. Finally, only 32 subgraphs were selected as the semantic signatures. BinGraph discovered malware variants with 98% of detection accuracy.