By Topic

Energy-Efficient Hardware Prefetching for CMPs Using Heterogeneous Interconnects

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Antonio Flores ; Dept. de Ing. y Tecnol. de Comput., Univ. of Murcia, Murcia, Spain ; Juan L. Aragón ; Manuel E. Acacio

In the last years high performance processor designs have evolved toward Chip-Multiprocessor (CMP) architectures that implement multiple processing cores on a single die. As the number of cores inside a CMP increases, the on-chip interconnection network will have significant impact on both overall performance and power consumption as previous studies have shown. On the other hand, CMP designs are likely to be equipped with latency hiding techniques like hardware prefetching in order to reduce the negative impact on performance that, otherwise, high cache miss rates would lead to. Unfortunately, the extra number of network messages that prefetching entails can drastically increase the amount of power consumed in the interconnect. In this work, we show how to reduce the impact of prefetching techniques in terms of power (and energy) consumption in the context of tiled CMPs. Our proposal is based on the fact that the wires used in the on-chip interconnection network can be designed with varying latency, bandwidth and power characteristics. By using a heterogeneous interconnect, where low-power wires are used for dealing with prefetched lines, significant energy savings can be obtained. Detailed simulations of a 16-core CMP show that our proposal obtains improvements of up to 30% in the power consumed by the interconnect (15-23% on average) with almost negligible cost in terms of execution time (average degradation of 2%).

Published in:

2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing

Date of Conference:

17-19 Feb. 2010