Skip to Main Content
PERG is an FPGA application that accelerates the process of searching a stream of bytes against a large, fixed database of string patterns. The stream could be network, disk, or file traffic, while the pattern database may represent computer viruses, spam, keyword sequences, or watermarks. A full pattern, or rule, consists of a sequence of one or more segments separated by gaps. Each segment is an exact sequence of bytes, possibly 100s of bytes long. Each gap contains arbitrary bytes, but is a known length. PERG uses a pattern compiler to transform a database of these rules into a hardware implementation. To the authorspsila knowledge, this is the first pattern match engine hardware designed for large virus databases. It is also first among network intrusion detection systems (NIDS), which are similar in nature to PERG, to implement Bloomier filters. Like hash tables, Bloomier filters produce false positives due to aliasing, so all potential matches must be verified by exact matching. However, Bloomier filters are more powerful than their ancestral Bloom filters because they can identify the exact rule of a potential match. This enables two key advantages for PERG. First, it allows PERG to use a checksum to very efficiently reduce false positives. Second, exact matching with PERG filters is much faster than with Bloom filter systems because only one suspect pattern needs to be checked, not all patterns. To reduce memory requirements, PERG packs as many segments as possible into each Bloomier filter by consolidating several different segment lengths into the same filter unit. This is done by dividing each segment into two smaller but overlapping fragments of the same length. Dividing into non-overlapping fragments would create shorter fragments of uneven lengths, leading to higher false positives and differing lengths to consolidate later. Using the ClamAV antivirus database, PERG fits 80,282 patterns containing over 8,224,848 characters into a single modest FPGA chi- - p with a small (4 MB) off-chip memory. It uses just 26 filter units, resulting in roughly 26x improved density (characters per memory bit) compared to the next-best NIDS pattern match engine which fits only 1/250th the characters. PERG can scan at roughly 200 MB/s and match the speed of most network or disk interfaces.