A DSM architecture for a parallel computer Cenju-4

4 Author(s)
Hosomi, T. ; C&C Media Res. Labs., NEC Corp., Kawasaki, Japan ; Kanoh, Y. ; Nakamura, M. ; Hirose, T.

Cenju-4 is a parallel computer built as a cache-coherent non-uniform memory access (ccNUMA) multiprocessor and designed to scale up to 1024 nodes. For scalability, Cenju-4 adopts a bit-pattern directory. This scheme gives a more precise representation than other imprecise schemes, such as the coarse vector scheme. Cenju-4 uses the multicast and gathering functions of its network to deliver invalidation request messages and to collect the replies. This keeps store access latency scalable even when a block is shared among all nodes. Cenju-4 also prevents starvation and deadlock by queuing certain types of messages in main memory. This provides a full solution to the starvation problem that arises with a centralized directory scheme and to the deadlock problem that arises with a single physical or virtual network. The buffer sizes required for queuing messages at each node are only 32K bytes and two 64K bytes on a 1024-node system. In this paper, we present the design of the distributed shared memory (DSM) architecture and some performance results.
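
To illustrate why the abstract calls the coarse vector scheme imprecise, the sketch below models a generic coarse vector directory entry, in which each bit stands for a fixed group of nodes. It is not Cenju-4's bit-pattern directory, whose encoding the abstract does not describe. The function names and the group size of 16 are assumptions chosen for illustration only.

```python
def coarse_vector_encode(sharers, num_nodes, group_size):
    """Return a bit mask with one bit per group of `group_size` nodes.

    Hypothetical helper: a bit is set if any node in its group shares the block.
    """
    mask = 0
    for node in sharers:
        if not 0 <= node < num_nodes:
            raise ValueError(f"node {node} is outside 0..{num_nodes - 1}")
        mask |= 1 << (node // group_size)
    return mask


def coarse_vector_decode(mask, num_nodes, group_size):
    """Return every node the directory must treat as a possible sharer."""
    return [n for n in range(num_nodes) if (mask >> (n // group_size)) & 1]


# Assumed configuration: 1024 nodes, 16 nodes per directory bit.
sharers = {3, 700}
mask = coarse_vector_encode(sharers, num_nodes=1024, group_size=16)
targets = coarse_vector_decode(mask, num_nodes=1024, group_size=16)

# Two real sharers expand to 32 invalidation targets (two whole groups).
print(len(sharers), len(targets))
```

In this toy configuration, two actual sharers force invalidations to all 32 nodes in their two groups. A more precise directory representation aims to reduce that kind of over-approximation.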

Published in:

High-Performance Computer Architecture, 2000. HPCA-6. Proceedings. Sixth International Symposium on

Date of Conference: