Skip to Main Content
Packet content scanning at high speed has become extremely important due to its applications in network security, network monitoring, HTTP load balancing, etc. In content scanning, the packet payload is compared against a set of patterns specified as regular expressions. In this paper, we first show that memory requirements using traditional methods are prohibitively high for many patterns used in packet scanning applications. We then propose regular expression rewrite techniques that can effectively reduce memory usage. Further, we develop a grouping scheme that can strategically compile a set of regular expressions into several engines, resulting in remarkable improvement of regular expression matching speed without much increase in memory usage. We implement a new DFA-based packet scanner using the above techniques. Our experimental results using real-world traffic and patterns show that our implementation achieves a factor of 12 to 42 performance improvement over a commonly used DFA- based scanner. Compared to the state-of-art NFA-based implementation, our DFA-based packet scanner achieves 50 to 700 times speedup.