We present the core elements of Samhita, a new user-level software distributed shared memory (DSM) system. Our work is motivated by two observations. First, the rise of many-core architectures is placing growing emphasis on threaded codes as the route to performance. Second, architectural trends, especially in high-performance interconnects, invite a fresh look at overcoming the bottlenecks that have historically limited DSM performance. Samhita leverages the capabilities of remote direct memory access (RDMA) interconnects and treats the problem of providing a shared global address space as a cache management problem. Performance results on two 256-processor clusters demonstrate scalability on microbenchmarks and two real applications. These results represent the largest-scale tests, and the highest performance, of any DSM system reported to date.
Published in: 2011 IEEE 17th International Conference on Parallel and Distributed Systems (ICPADS)
Date of Conference: 7-9 Dec. 2011