Distributed Shared Memory (DSM) machines are a wide class of multi-processor computing systems in which a large virtually shared address space is mapped onto a network of physically distributed memories. High memory latency and network contention are two of the main factors that limit performance scaling of such architectures. Modern high-performance DSM systems have evolved towards massive hardware multi-threading and fine-grained memory hashing to tolerate irregular latencies, avoid network hot-spots, and improve scalability. Parallel simulation is a promising approach that has been used extensively to model the performance of such large-scale machines. One of the most critical factors in managing the trade-off between simulation speed and accuracy is network modeling. The Cray XMT is a massively multi-threaded supercomputing architecture that belongs to the DSM class. In this paper, we discuss the development of a network contention model for a full-system XMT simulator. We start by measuring the effects of network contention on a 128-processor XMT machine. We then investigate the trade-off between simulation accuracy and speed by comparing three network models that operate at different levels of accuracy. The comparison and model validation are performed by executing a string-matching algorithm on the full-system simulator and on the actual machine, using three datasets that generate noticeably different contention patterns. Results show that the simulated execution time remains within 10% of that measured on the real machine. We also show that the slowdown due to contention modeling is limited to 20% when simulating a small number of processors and becomes negligible for simulations with higher processor counts.