The Aho-Corasick (AC) algorithm is an efficient multiple-pattern matching algorithm for large-scale pattern sets, but it consumes too much memory. A new space optimization algorithm, AC-Bitmap, is proposed, based on the bitmap data structure. It divides the states of the automaton into two groups according to their depth in the dictionary tree built from all the patterns, and reduces the memory consumption of the deeper group, whose states are visited less often during matching. For this deeper group, it also uses the bitmap to improve retrieval time. Experiments on random texts and literature indicate that AC-Bitmap significantly reduces memory consumption while keeping time efficiency comparable to that of the AC algorithm.
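The abstract does not give the node layout, so the following is only a minimal sketch of the general idea, under stated assumptions: shallow states keep a full 256-entry transition table, while deeper states store a 256-bit bitmap plus a packed list of next states, with lookup done by counting the set bits below the input byte. The names `BitmapNode`, `build_node`, `next_state`, and the cut-off `DEPTH_THRESHOLD` are hypothetical and are not taken from the paper; failure links and output sets are omitted.

```python
DEPTH_THRESHOLD = 2  # hypothetical cut-off between "shallow" and "deep" states


class BitmapNode:
    """Sparse transition node: a 256-bit bitmap plus a packed child list."""

    def __init__(self, transitions):
        # transitions: dict mapping a byte value (0-255) to a next-state id
        self.bitmap = 0
        self.children = []
        for byte in sorted(transitions):
            self.bitmap |= 1 << byte          # mark that this byte has a transition
            self.children.append(transitions[byte])

    def lookup(self, byte):
        """Return the stored next state for `byte`, or None if there is none."""
        mask = 1 << byte
        if not self.bitmap & mask:
            return None
        # Rank query: the number of set bits below `byte` is the index
        # of this transition in the packed child list.
        index = bin(self.bitmap & (mask - 1)).count("1")
        return self.children[index]


def build_node(transitions, depth):
    """Use a dense table for shallow states and a bitmap node for deep ones."""
    if depth < DEPTH_THRESHOLD:
        table = [None] * 256
        for byte, state in transitions.items():
            table[byte] = state
        return table
    return BitmapNode(transitions)


def next_state(node, byte):
    """Look up a transition in either node representation."""
    if isinstance(node, list):
        return node[byte]
    return node.lookup(byte)


shallow = build_node({ord("a"): 1}, depth=0)
deep = build_node({ord("a"): 7, ord("c"): 9, ord("z"): 12}, depth=5)
```

In this sketch a deep state with three transitions stores three entries and one bitmap instead of 256 table slots, at the cost of a bit count per lookup. Whether the paper uses this exact layout or threshold is not stated in the abstract.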