Numerous variations of configurable caches, having variable parameters like total size, line size, and associativity, have been proposed in commercial microprocessors in recent years. Tuning a configurable cache to a target application has been shown to reduce memory-access power by over 50%. However, searching the configuration space for the best configuration can require much time or power, even when using recent cache tuning heuristics. We sought to determine, for a particular domain of applications, the smallest subset of cache configurations that would still enable effective tuning. For a suite of 34 benchmarks and a cache with 18 possible configurations, we determine through an exhaustive search of all possible subsets, that only 3 or 4 candidate configurations are necessary to support tuning. We introduce a new heuristic, adapted from an efficient and effective heuristic developed for data mining, to quickly determine the best configurations for any sized subset, with near optimal results. We then consider a configurable cache with 17,640 possible configurations and improve our heuristic to include a pre-pruning step, yielding near optimal tuning results. We conclude that only 3 or 4 possible cache configurations are needed to offer a near optimal configuration for every benchmark in our suite - resulting in a 91% reduction in design space exploration time over a state-of-the-art cache tuning heuristic
Published in:
Design Automation Conference, 2006 43rd ACM/IEEE
Date of Conference: 0-0 0