This paper describes the architecture and evaluation results of the parallel computer Cenju-4. Cenju-4 supports two memory architectures: distributed memory with user-level message-passing communication, and distributed shared memory with cache-coherent nonuniform memory access (cc-NUMA). A Cenju-4 system consists of 8 to 1024 nodes connected by a multistage network that provides multicast, synchronization, and gather functions. Cenju-4 adopts a Mach microkernel-based operating system, which provides several services for parallel processing. We attained a communication latency of 5.5 μsec and a communication throughput of 168 Mbytes/sec in message-passing communication.