
Using Aggressor Thread Information to Improve Shared Cache Management for CMPs

2 Author(s)
Wanli Liu ; Dept. of Electr. & Comput. Eng., Univ. of Maryland at Coll. Park, College Park, MD, USA ; Yeung, D.

Shared cache allocation policies play an important role in determining CMP performance. The simplest policy, LRU, allocates cache implicitly as a consequence of its replacement decisions. But under high cache interference, LRU performs poorly because some memory-intensive threads, or aggressor threads, allocate cache that could be more gainfully used by other (less memory-intensive) threads. Techniques like cache partitioning can address this problem by performing explicit allocation to prevent aggressor threads from taking over the cache. Whether implicit or explicit, the key factor controlling cache allocation is victim thread selection. The choice of victim thread relative to the cache-missing thread determines each cache miss's impact on cache allocation: if the two are the same, allocation doesn't change, but if the two are different, then one cache block shifts from the victim thread to the cache-missing thread. In this paper, we study an omniscient policy, called ORACLE-VT, that uses off-line information to always select the best victim thread, and hence, maintain the best per-thread cache allocation at all times. We analyze ORACLE-VT, and find it victimizes aggressor threads about 80% of the time. To see if we can approximate ORACLE-VT, we develop AGGRESSOR-VT, a policy that probabilistically victimizes aggressor threads with strong bias. Our results show AGGRESSOR-VT comes close to ORACLE-VT's miss rate, achieving three-quarters of its gain over LRU and roughly half of its gain over an ideal cache partitioning technique. To make AGGRESSOR-VT feasible for real systems, we develop a sampling algorithm that "learns" the identity of aggressor threads via runtime performance feedback. We also modify AGGRESSOR-VT to permit adjusting the probability for victimizing aggressor threads, and use our sampling algorithm to learn the per-thread victimization probabilities that optimize system performance (e.g., weighted IPC).
We call this policy AGGRESSORpr-VT. Our results show AGGRESSORpr-VT outperforms LRU, UCP, and an ideal cache way partitioning technique by 4.86%, 3.15%, and 1.09%, respectively.
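The core victim-selection idea described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: the set layout (a list of `(thread_id, lru_age)` blocks), the function name, and the default probability are all assumptions made for the sketch. With probability `p_victimize`, the policy restricts eviction to blocks owned by a flagged aggressor thread; otherwise it falls back to plain LRU over the whole set.

```python
import random

def select_victim(set_blocks, aggressor_threads, p_victimize=0.9, rng=None):
    """Pick a victim block from one cache set (hypothetical sketch).

    set_blocks:        list of (thread_id, lru_age) tuples; larger age = older.
    aggressor_threads: set of thread ids currently flagged as aggressors.
    p_victimize:       probability of biasing eviction toward aggressor blocks
                       (per-thread values of this knob are what AGGRESSORpr-VT
                       would tune via runtime feedback).
    """
    rng = rng or random.Random()
    aggressor_blocks = [b for b in set_blocks if b[0] in aggressor_threads]
    # With strong bias, evict the LRU block belonging to an aggressor thread,
    # shifting one block of cache allocation away from that thread.
    if aggressor_blocks and rng.random() < p_victimize:
        return max(aggressor_blocks, key=lambda b: b[1])
    # Otherwise fall back to ordinary LRU replacement over the whole set.
    return max(set_blocks, key=lambda b: b[1])
```

Setting `p_victimize` to 1.0 always victimizes the aggressor's LRU block when one is present, while 0.0 degenerates to plain LRU; intermediate, per-thread values correspond to the adjustable probabilities the sampling algorithm learns.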

Published in:

2009 18th International Conference on Parallel Architectures and Compilation Techniques (PACT '09)

Date of Conference:

12-16 Sept. 2009