A Simple Performance Model for Multithreaded Applications Executing on Non-uniform Memory Access Computers

Authors:

Yang, R. (School of Computer Science, Australian National University, Canberra, ACT, Australia); Antony, J.; Rendell, A.P.

In this work, we extend and evaluate a simple performance model to account for NUMA and bandwidth effects in single- and multi-threaded calculations within the Gaussian 03 computational chemistry code on a contemporary multi-core NUMA platform. Using the thread and memory placement APIs in Solaris, we present results for a set of calculations, from which we analyze on-chip interconnect and intra-core bandwidth contention and show the importance of load balancing between threads. The extended model predicts single-threaded performance to within 1% error and most multi-threaded experiments to within 15% error. Our results and modeling show that accounting for bandwidth constraints within user-space code is beneficial.

Published in:

11th IEEE International Conference on High Performance Computing and Communications (HPCC '09), 2009

Date of Conference:

25-27 June 2009