Compiler optimization of memory-resident value communication between speculative threads

Author(s):

Zhai, A. (Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA); Colohan, C.B.; Steffan, J.G.; Mowry, T.C.

Abstract:

Efficient inter-thread value communication is essential for improving performance in thread-level speculation (TLS). Although several mechanisms for improving value communication using hardware support have been proposed, there is relatively little work on exploiting the potential of compiler optimization. Building on recent research on compiler optimization of scalar value communication between speculative threads, we propose compiler techniques for the optimization of memory-resident values. In TLS, data dependences through memory-resident values are tracked by the underlying hardware and preserved by re-executing any speculative thread that violates a dependence; however, re-execution incurs a large performance penalty and should be used only to resolve data dependences that are infrequent. In contrast, value communication for frequently-occurring data dependences must be very efficient. We propose using the compiler to first identify frequently-occurring memory-resident data dependences, then insert synchronization for communicating values to preserve these dependences. We find that by synchronizing frequently-occurring data dependences we can significantly improve the efficiency of parallel execution. A comparison between compiler-inserted and hardware-inserted memory synchronization reveals that the two techniques are complementary, with each technique benefiting different benchmarks.
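
The transformation described in the abstract can be pictured on a simple reduction-style loop. The following is a minimal C sketch, assuming hypothetical tls_wait()/tls_signal() primitives as stand-ins for the hardware value-forwarding support; the function names, signatures, and insertion points are illustrative assumptions, not the paper's actual compiler interface.

    /* Hypothetical synchronization primitives: block until the predecessor
       speculative thread has produced the value at addr, and forward the
       value at addr to the successor thread, respectively. */
    extern void tls_wait(void *addr);
    extern void tls_signal(void *addr);

    /* Original loop: every iteration loads *sum, which was written by the
       previous iteration.  Speculative threads executing iterations in
       parallel would repeatedly violate this memory-resident dependence
       and be re-executed. */
    void accumulate(int *sum, const int *data, int n) {
        for (int i = 0; i < n; i++) {
            int t = *sum;        /* load of the memory-resident value */
            t += data[i];
            *sum = t;            /* store that creates the dependence */
        }
    }

    /* After compiler-inserted synchronization: each speculative thread
       (one iteration) waits for the value instead of speculating on it,
       and signals its successor as soon as the store has been performed,
       so the frequent dependence no longer triggers violation and
       re-execution. */
    void accumulate_tls(int *sum, const int *data, int n) {
        for (int i = 0; i < n; i++) {    /* each iteration = one speculative thread */
            tls_wait(sum);
            int t = *sum;
            t += data[i];
            *sum = t;
            tls_signal(sum);
        }
    }

Infrequent dependences elsewhere in the loop would still be left to the hardware's violation-and-re-execution mechanism; synchronization is worthwhile only where the dependence occurs often enough that re-execution would dominate.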

Published in:

International Symposium on Code Generation and Optimization (CGO 2004)

Date of Conference:

20-24 March 2004
