While higher associativities are common at L2 and last-level caches, direct-mapped and low-associativity caches are still used at the L1 level. Lower associativities result in higher miss rates but faster access times on hits. Another issue that inhibits cache performance is the non-uniformity of accesses exhibited by most applications: some sets are underutilized while others receive the majority of accesses. Higher-associativity caches mitigate access non-uniformity but do not eliminate it. This implies that increasing cache size or associativity may not lead to proportionally improved hit rates. Several solutions have been proposed in the literature over the past decade to address this non-uniformity, and each proposal independently claims improvements. However, because the published results use different benchmarks and different experimental setups, they are not easy to compare. In this paper we report a side-by-side comparison of these techniques. Our conclusion is that each application may benefit from a different technique, and no single scheme works universally well for all applications. Our research investigates the use of multiple techniques within a processor core and across cores in a multicore system to improve the performance of cache memory hierarchies. The study reported in this paper allows us to select the best possible solution for each running application. We also include some preliminary results of using multiple solutions simultaneously when running multiple threads.
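The set non-uniformity described above can be illustrated with a minimal sketch, not taken from the paper: a tiny direct-mapped cache model that counts per-set accesses for a synthetic strided trace. The cache geometry (64-byte lines, 8 sets) and the trace pattern are assumptions chosen only to make the skew visible.

```python
# Illustrative sketch (assumed parameters, not the paper's setup):
# count how often each set of a small direct-mapped cache is accessed.
from collections import Counter

LINE_SIZE = 64   # bytes per cache line (assumption)
NUM_SETS = 8     # sets in the modeled direct-mapped cache (assumption)

def cache_set(addr: int) -> int:
    """Index bits select the set: (addr // line_size) mod num_sets."""
    return (addr // LINE_SIZE) % NUM_SETS

# Synthetic trace: two arrays walked with a 1024-byte stride, a common
# pattern in which many addresses collide on the same few sets.
trace = [base + 1024 * i for base in (0, 192) for i in range(100)]

counts = Counter(cache_set(a) for a in trace)
for s in range(NUM_SETS):
    print(f"set {s}: {counts[s]} accesses")
```

Because the 1024-byte stride is a multiple of `LINE_SIZE * NUM_SETS`, all 200 accesses land in just two of the eight sets, an extreme case of the skew that the surveyed techniques try to balance.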