Skip to Main Content
String matching plays a key role in web content monitoring systems. Suffix matching algorithms have good time efficiency, and thus are widely used. These algorithms require that all patterns in a set have the same length. When the patterns cannot satisfy this requirement, the leftmost characters, m being the length of the shortest pattern, are extracted to construct the data structure. We call such m-character strings partial strings. However, a simple extraction from the left does not address the impact of partial string locations on search speed. We propose a novel method to extract the partial strings from each pattern which maximizes search speed. More specifically, with this method we can compute all the corresponding searching time cost by theoretical derivation, and choose the location which yields an approximately minimal search time. We evaluate our method on two rule sets: Snort and ClamAV. Experiments show that in most cases, our method achieves the fastest searching speed in all possible locations of partial string extraction, and is about 5%-20% faster than the alternative methods.
Date of Conference: 10-15 April 2011