A channel caching scheme on an optical bus-based distributed architecture

Authors: M. Zhu, H. Narravula, C. Katsinis, D. Hecht (Dept. of Electr. & Comput. Eng., Drexel Univ., Philadelphia, PA, USA)

Summary form only given. Reducing the effect of hot spots is increasingly important for extracting performance from modern processor clusters. Traditionally, compiler techniques have been used for static analysis of hot spot patterns in parallel applications, and the operating system then performs optimizations to reduce hot spot overhead. However, hot spots cannot be fully avoided due to the dynamic nature of applications. We propose a new hot spot optimization scheme based on a broadcast-based optical interconnection network, the SOME-Bus, in which each node has a dedicated broadcast channel connecting it to all other nodes without contention. The scheme introduces additional hardware that considerably reduces the latency of hot spot requests and acknowledgments. Hot spots are assumed to be identifiable either through static analysis or by a run-time profiler. Our scheme then caches these hot spot blocks much closer to the network channel, providing a very low latency path between the input and output queues in the network. The technique has been implemented in a SOME-Bus simulator and verified with popular parallel algorithms such as matrix-matrix multiplication. Preliminary results show that the scheme reduces application completion times by up to 24% over a system without channel caching.
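The core idea above, serving identified hot-spot blocks from a small cache placed next to the node's broadcast channel rather than through the full memory path, can be illustrated with a minimal latency model. This is a hypothetical sketch only: the latency constants, function names, and request stream below are illustrative assumptions, not parameters of the paper's actual SOME-Bus simulator.

```python
# Sketch of the channel-caching idea: requests for hot-spot blocks are
# serviced from a cache between the network input/output queues (low
# latency); all other blocks take the full memory path (high latency).
# All values here are assumed for illustration.

CHANNEL_CACHE_LATENCY = 2   # cycles: block held in the channel cache
MEMORY_PATH_LATENCY = 20    # cycles: full path through node memory

def service_latency(block, hot_blocks):
    """Latency to service one remote request for `block`."""
    if block in hot_blocks:           # hot-spot block cached at the channel
        return CHANNEL_CACHE_LATENCY
    return MEMORY_PATH_LATENCY        # cold block: normal memory path

def total_latency(requests, hot_blocks=frozenset()):
    """Sum the service latency over a stream of block requests."""
    return sum(service_latency(b, hot_blocks) for b in requests)

# A skewed request stream: block 0 is a hot spot receiving 80% of traffic.
requests = [0] * 80 + list(range(1, 21))

baseline = total_latency(requests)                 # no channel cache
cached = total_latency(requests, hot_blocks={0})   # block 0 cached

print(f"baseline: {baseline} cycles, with channel cache: {cached} cycles")
```

With a skewed stream like this, caching the single hot block removes most of the total latency, which is the intuition behind why the scheme helps most when traffic concentrates on a few blocks.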

Published in:

Proceedings of the 18th International Parallel and Distributed Processing Symposium, 2004

Date of Conference:

26-30 April 2004