By Topic

A progressive approach to handling message-dependent deadlock in parallel computer systems

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

2 Author(s)
Yong Ho Song ; SMART Interconnects Group, Univ. of Southern California, Los Angeles, CA, USA ; Pinkston, T.M.

Handling deadlocks is essential for providing reliable communication paths between processing nodes in parallel computer systems. The existence of multiple message types and associated inter-message dependencies may cause message-dependent deadlocks in networks that are designed to be free of routing deadlock. Most methods currently used for dealing with message-dependent deadlocks require more system resources than are necessary and/or do not use system resources efficiently. This may have an adverse effect on system performance if resources are scarce. In this paper, we characterize the frequency of message-dependent deadlocks in multiprocessor/multicomputer systems. We also propose a handling technique for message-dependent deadlocks based on progressive deadlock recovery and evaluate its performance with other approaches. Results show that message-dependent deadlocks occur very infrequently under typical circumstances thus, rendering approaches based on avoiding them overly restrictive in the common case. The proposed technique relaxes restrictions considerably, allowing the routing of packets and the handling of message-dependent deadlocks to be much more efficient-particularly when network resources are scarce.

Published in:

Parallel and Distributed Systems, IEEE Transactions on  (Volume:14 ,  Issue: 3 )