By Topic

Data-Driven Multithreading Using Conventional Microprocessors

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Kyriacou, C. ; Comput. Sci. & Eng. Dept., Frederick Inst. of Technol. ; Evripidou, P. ; Trancoso, P.

This paper describes the data-driven multithreading (DDM) model and how it may be implemented using off-the-shelf microprocessors. Data-driven multithreading is a nonblocking multithreading execution model that tolerates internode latency by scheduling threads for execution based on data availability. Scheduling based on data availability can be used to exploit cache management policies that reduce significantly cache misses. Such policies include firing a thread for execution only if its data is already placed in the cache. We call this cache management policy the CacheFlow policy. The core of the DDM implementation presented is a memory mapped hardware module that is attached directly to the processor's bus. This module is responsible for thread scheduling and is known as the thread synchronization unit (TSU). The evaluation of DDM was performed using simulation of the data-driven network of workstations (D2NOW). D2NOW is a DDM implementation built out of regular workstations augmented with the TSU. The simulation was performed for nine scientific applications, seven of which belong to the SPLASH-2 suite. The results show that DDM can tolerate well both the communication and synchronization latency. Overall, for 16 and 32-node D2NOW machines the speedup observed was 14.4 and 26.0, respectively

Published in:

Parallel and Distributed Systems, IEEE Transactions on  (Volume:17 ,  Issue: 10 )