Evaluation of Techniques to Improve Cache Access Uniformities

Authors:

Nwachukwu, I. (Univ. of North Texas, Denton, TX, USA); Kavi, K.; Ademola, F.; Yan, C.

Abstract:

While higher associativities are common at the L2 or last-level caches, direct-mapped and low-associativity caches are still used at the L1 level. Lower associativities result in higher miss rates but offer fast access times on hits. Another issue that inhibits cache performance is the non-uniformity of accesses exhibited by most applications: some sets are underutilized while others receive the majority of accesses. Higher-associativity caches mitigate access non-uniformities but do not eliminate them. This implies that increasing cache sizes or associativities may not lead to proportionally improved hit rates. Several solutions have been proposed in the literature over the past decade to address the non-uniformity of accesses, and each proposal independently claims improvements. However, because the published results use different benchmarks and experimental setups, they are not easy to compare. In this paper we report a side-by-side comparison of these techniques. The conclusion of our work is that each application may benefit from a different technique and no single scheme works universally well for all applications. Our research is investigating the use of multiple techniques within a processor core and across cores in a multicore system to improve the performance of cache memory hierarchies. The study reported in this paper allows us to select the best possible solution for each running application. We also include some preliminary results of using multiple solutions simultaneously when running multiple threads.
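As a concrete illustration of the access non-uniformity described in the abstract (not part of the paper's experimental setup), the sketch below counts how often each set of a direct-mapped cache is indexed by a power-of-two strided access pattern. The cache geometry (64 sets, 64-byte lines), the base address, and the stride are assumed values chosen only for this example.

/* Illustrative sketch: measure per-set access counts for a conventional
 * direct-mapped index function, set = (address / line_size) % num_sets.
 * All parameters below are assumptions for the example, not the paper's. */
#include <stdio.h>
#include <stdint.h>

#define NUM_SETS   64        /* assumed number of cache sets      */
#define LINE_BYTES 64        /* assumed cache line size in bytes  */

int main(void) {
    unsigned long set_hits[NUM_SETS] = {0};

    /* A strided walk (e.g., down a matrix column): with a power-of-two
     * stride, the modulo index function concentrates accesses on a few
     * sets while the others stay idle. */
    uint64_t base   = 0x10000000ULL;   /* assumed base address     */
    uint64_t stride = 4096;            /* assumed row size in bytes */
    for (int i = 0; i < 100000; i++) {
        uint64_t addr = base + (uint64_t)i * stride;
        unsigned set  = (unsigned)((addr / LINE_BYTES) % NUM_SETS);
        set_hits[set]++;
    }

    /* Print the per-set distribution; a uniform workload would spread
     * accesses evenly, this pattern does not. */
    for (int s = 0; s < NUM_SETS; s++)
        printf("set %2d: %lu accesses\n", s, set_hits[s]);
    return 0;
}

With these assumed parameters, every access in the loop maps to the same set, which is the kind of skew that the set-balancing techniques compared in the paper aim to even out.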

Published in:

2011 International Conference on Parallel Processing (ICPP)

Date of Conference:

13-16 Sept. 2011