This paper describes Shaman, our distributed architectural simulator for shared-memory multiprocessors (SMPs). The simulator runs on a PC cluster consisting of multiple front-end nodes, which simulate the instruction-level behavior of a target multiprocessor in parallel, and a back-end node, which simulates the target memory system. The front-end also simulates the logical behavior of the shared memory using a software distributed shared memory (DSM) technique and generates the memory references that drive the back-end. A notable feature of our simulator is reference filtering, which reduces the number of references transferred from the front-end to the back-end by exploiting the DSM mechanism and coherent cache simulation on the front-end. This technique, together with our DSM implementation, gives the Shaman simulator high performance. Using 16 front-end nodes to simulate a 16-way target SMP, we achieved 335 million and 392 million simulated clock cycles per second for LU decomposition and FFT, respectively, from the SPLASH-2 kernel benchmarks.
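The idea behind reference filtering can be illustrated with a small sketch. This is not Shaman's implementation: it assumes, for illustration only, that each front-end node keeps a simple direct-mapped cache model and forwards to the back-end only the references that miss in it. The names `FilterCache` and `filter_references`, the cache geometry, and the address trace are all hypothetical, and the sketch omits the DSM mechanism and cache coherence entirely.

```python
from dataclasses import dataclass, field


@dataclass
class FilterCache:
    """Hypothetical direct-mapped cache model used only to filter references."""
    num_lines: int = 4      # assumed geometry, chosen for illustration
    line_size: int = 64     # bytes per cache line (assumption)
    tags: dict = field(default_factory=dict)  # line index -> stored tag

    def access(self, addr: int) -> bool:
        """Return True on a hit; on a miss, install the line and return False."""
        block = addr // self.line_size
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags.get(index) == tag:
            return True
        self.tags[index] = tag  # replace whatever line occupied this index
        return False


def filter_references(addresses, cache):
    """Keep only the references that miss in the front-end cache model."""
    return [addr for addr in addresses if not cache.access(addr)]


# A made-up address trace; only the misses would be sent to the back-end.
trace = [0, 8, 64, 0, 72, 256, 0]
forwarded = filter_references(trace, FilterCache())
print(f"{len(forwarded)} of {len(trace)} references forwarded: {forwarded}")
```

In this toy trace, references that hit in the front-end model are dropped locally, so fewer references cross the network to the back-end node. That reduction in traffic is the effect the abstract attributes to filtering.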
Date of Conference: 2002