By Topic

Scalable multi-pipeline architecture for high performance multi-pattern string matching

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Weirong Jiang ; Ming Hsieh Dept. of Electr. Eng., Univ. of Southern California, Los Angeles, CA, USA ; Yang, Y.E. ; Prasanna, V.K.

Multi-pattern string matching remains a major performance bottleneck in network intrusion detection and anti-virus systems for high-speed deep packet inspection (DPI). Although Aho-Corasick deterministic finite automaton (AC-DFA) based solutions produce deterministic throughput and are widely used in today's DPI systems such as Snort [1] and ClamAV [2], the high memory requirement of AC-DFA (due to the large number of state transitions in AC-DFA) inhibits efficient hardware implementation to achieve high performance. Some recent work [3], [4] has shown that the AC-DFA can be reduced to a character trie that contains only the forward transitions by incorporating pipelined processing. But they have limitations in either handling long patterns or extensions to support multi-character input per clock cycle to achieve high throughput. This paper generalizes the problem and proves formally that a linear pipeline with H stages can remove all cross transitions to the top H levels of a AC-DFA. A novel and scalable pipeline architecture for memory-efficient multi-pattern string matching is then presented. The architecture can be easily extended to support multi-character input per clock cycle by mapping a compressed AC-DFA [5] onto multiple pipelines. Simulation using Snort and ClamAV pattern sets shows that a 8-stage pipeline can remove more than 99% of the transitions in the original AC-DFA. The implementation on a state-of-the-art field programmable gate array (FPGA) shows that our architecture can store on a single FPGA device the full set of string patterns from the latest Snort rule set. Our FPGA implementation sustains 10+ Gbps throughput, while consuming a small amount of on-chip logic resources. Also desirable scalability is achieved: the increase in resource requirement of our solution is sub-linear with the throughput improvement.

Published in:

Parallel & Distributed Processing (IPDPS), 2010 IEEE International Symposium on

Date of Conference:

19-23 April 2010