As malware becomes pervasive and fast-evolving on the Internet, every computer connected to the outside world faces the risk of malware attacks. It is therefore important not only to detect malware as early as possible but also to determine which computers have been attacked. Among the various methods for finding and tracing malware, retrospective detection is a promising one. Once a threat is identified, it allows one to determine exactly which hosts or users opened similar files by searching historical information. In the past, the huge volume of historical information represented an insurmountable barrier to such tracing. Fortunately, with the evolution of cloud computing technologies, this barrier can now be overcome. In this paper, we propose a new retrospective detection approach based on the relationships between Portable Executable (PE) format files. We implement our system on a Hadoop platform and evaluate its effectiveness and efficiency using 18 real-world malware samples. Our results show that our system achieves a higher detection rate and a lower false positive rate than the well-known Splunk tool. We also find that, although cloud computing is well suited to processing a small number of huge files, it has shortcomings in dealing with a large number of small files.