Network forensic analysts use payload attribution systems (PAS) as an investigative tool that stores and summarizes large amounts of network traffic, including full packet payloads. An investigator can then query the system for a specific string and check whether any packet previously transmitted in the network contained that string. A shortcoming of the previously proposed techniques is that they cannot support wildcard queries. Wildcard queries are important because they allow the investigator to locate strings in the payload when only part of the string is known. This paper presents a new data structure for payload attribution, named Character Dependent Multi-Bloom Filters, which improves on the previously proposed techniques and also supports wildcard queries. A theoretical study of the proposed method is conducted to evaluate its false-positive rate when responding to queries, and the theoretical analysis is then verified through a number of experiments. Furthermore, the proposed method is compared with the state-of-the-art attribution techniques presented in the literature. The results suggest that, at a similar false-positive rate, Character Dependent Multi-Bloom Filters achieve a data reduction ratio of about 265:1, as opposed to 210:1 for the previously proposed state-of-the-art techniques. More importantly, the results indicate that a wildcard query with seven unknown characters takes less than 1 second to process using the proposed method, whereas with the previous techniques, which require an exhaustive search, the same query takes about 4500 years.
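To illustrate why wildcard queries are costly for a conventional Bloom-filter-based attribution system, the following is a minimal sketch of a plain Bloom filter, not the Character Dependent Multi-Bloom Filters presented in the paper. The names `BloomFilter`, `store_payload`, and `exhaustive_candidates`, the 8-byte sliding window, the SHA-256-based hashing, and the assumption that each unknown character may be any of 256 byte values are all illustrative choices that do not come from the paper.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over byte strings (illustrative only)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive one bit position per hash by salting SHA-256 with the index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        # True means "possibly seen" (false positives can occur);
        # False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def store_payload(bf: BloomFilter, payload: bytes, block_size: int = 8) -> None:
    """Summarize a payload by inserting every block_size-byte window.

    Assumption: a simple sliding window, chosen here for brevity.
    """
    for i in range(len(payload) - block_size + 1):
        bf.add(payload[i:i + block_size])


def exhaustive_candidates(unknown_chars: int, alphabet_size: int = 256) -> int:
    """Number of concrete strings an exhaustive wildcard search must test."""
    return alphabet_size ** unknown_chars


bf = BloomFilter(num_bits=1 << 16, num_hashes=4)
store_payload(bf, b"GET /index.html HTTP/1.1")
print(b"/index.h" in bf)          # an inserted block is always reported as present
print(exhaustive_candidates(7))   # 256**7, roughly 7.2e16 candidate strings
```

Because a filter of this kind answers only exact-membership queries, a query with unknown characters has to be expanded into every concrete candidate string and each candidate tested separately. Under the byte-alphabet assumption above, seven unknown characters already yield 256^7 candidates, which is consistent in spirit with the exhaustive-search cost reported for the previous techniques.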