An increasing number of architectural techniques have relied on hardware counting Bloom filters (CBFs) to improve upon the energy, delay, and complexity of various processor structures. CBFs improve the energy and speed of membership tests by maintaining an imprecise and compact representation of a large set to be searched. This paper studies the energy, delay, and area characteristics of two implementations for CBFs using full custom layouts in a commercial 0.13-μm fabrication technology. One implementation, S-CBF, uses an SRAM array of counts and a shared up/down counter. Our proposed implementation, L-CBF, utilizes an array of up/down linear feedback shift registers and local zero detectors. Circuit simulations show that for a 1K-entry CBF with a 15-bit count per entry, L-CBF compared to S-CBF is 3.7× or 1.6× faster and requires 2.3× or 1.4× less energy depending on the operation. Additionally, this paper presents analytical energy and delay models for L-CBF. These models can estimate energy and delay of various CBF organizations during architectural level explorations when a physical level implementation is not available. Our results demonstrate that for a variety of L-CBF organizations, the estimations by analytical models are within 5% and 10% of Spectre simulation results for delay and energy, respectively.
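
As an illustration only, the following Python sketch models the behavior of a CBF whose counts are held in up/down linear feedback shift registers with a per-entry zero test, in the spirit of L-CBF. It is a software model under stated assumptions, not the circuit studied here: the feedback taps (x^15 + x^14 + 1), the state chosen to encode a zero count (`ZERO_STATE`), the modulo hash, and all function and class names are hypothetical choices made for the example.

```python
WIDTH = 15                  # bits per count, matching the 15-bit count per entry
MASK = (1 << WIDTH) - 1
ZERO_STATE = 1              # assumed LFSR state encoding "count == 0" (must be nonzero)


def lfsr_up(state: int) -> int:
    """Advance the LFSR one step (count up). Taps are an assumption."""
    feedback = (state ^ (state >> 1)) & 1
    return (state >> 1) | (feedback << (WIDTH - 1))


def lfsr_down(state: int) -> int:
    """Step the LFSR backwards (count down); exact inverse of lfsr_up."""
    recovered_lsb = ((state >> (WIDTH - 1)) ^ state) & 1
    return ((state << 1) & MASK) | recovered_lsb


class CountingBloomFilter:
    """Behavioral model: one LFSR count per entry plus a local zero test."""

    def __init__(self, num_entries: int = 1024) -> None:
        self.num_entries = num_entries
        self.counts = [ZERO_STATE] * num_entries

    def _index(self, key: int) -> int:
        # Hypothetical hash: a simple modulo of the key.
        return key % self.num_entries

    def insert(self, key: int) -> None:
        i = self._index(key)
        self.counts[i] = lfsr_up(self.counts[i])

    def delete(self, key: int) -> None:
        # The caller must only delete keys that were previously inserted;
        # stepping down from ZERO_STATE would leave a nonzero state behind.
        i = self._index(key)
        self.counts[i] = lfsr_down(self.counts[i])

    def may_contain(self, key: int) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return self.counts[self._index(key)] != ZERO_STATE


cbf = CountingBloomFilter()
cbf.insert(42)
cbf.insert(42 + 1024)        # a different key that maps to the same entry
cbf.delete(42)
print(cbf.may_contain(42))   # True: the aliasing key keeps the count nonzero
cbf.delete(42 + 1024)
print(cbf.may_contain(42))   # False: the count has returned to ZERO_STATE
```

The first probe shows why the representation is imprecise: a key that has been deleted can still test as possibly present when another key shares its entry. The second shows that a zero count is a definite miss, which is the case that lets the larger set go unsearched. Because an LFSR of this width cycles through 2^15 - 1 states, the model wraps around after 32767 consecutive increments of the same entry.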