Scheduled System Maintenance:
On Monday, April 27th, IEEE Xplore will undergo scheduled maintenance from 1:00 PM - 3:00 PM ET (17:00 - 19:00 UTC). No interruption in service is anticipated.
By Topic

Integrating multiple forms of multithreaded execution on multi-SMT systems: a study with scientific applications

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

4 Author(s)
Curtis-Maury, M. ; Dept. of Comput. Sci., Coll. of William & Mary, Williamsburg, VA, USA ; Tanping Wang ; Antonopoulos, C. ; Nikolopoulos, D.

Most scientific applications have high degrees of parallelism and thread-level parallel execution appears to be a natural choice for executing these applications on systems composed of SMT processors. Unfortunately, contention for shared resources limits the performance advantages of multithreading on current SMT processors, thus leading to marginal utilization of multiple hardware threads and even slowdown due to multithreading. We show, through a rigorous evaluation with hardware monitoring counters on a real multi-SMT system, that in traditionally scalable parallel applications conflicting resource requirements are - due to the high degree of resource sharing - accountable for deeply sub-optimal performance. Motivated by this observation, we investigate the use of alternative forms of multithreaded execution, including adaptive thread throttling and speculative runahead execution, to make better use of the resources of SMT processors. Alongside the evaluation, we propose new methods to integrate these techniques into the same binary to maximize performance on multi-SMTsystems. Our study shows that combining adaptive throttling and speculative precomputation with regular thread-level parallelization leads to significant performance improvements in parallel codes which suffer from inter-thread interference and contention on SMTs.

Published in:

Quantitative Evaluation of Systems, 2005. Second International Conference on the

Date of Conference:

19-22 Sept. 2005