Historically, scientific computing applications have been statically linked before running on massively parallel High Performance Computing (HPC) platforms. In recent years, demand for supporting dynamically linked applications at large scale has increased. When programs running at large scale dynamically load shared objects, they often request the same file from shared storage. These independent requests tax the shared storage and the network, causing a significant delay in computation time. In this paper, we propose leveraging a proven file-sharing technique, BitTorrent, abstracted by an on-node FUSE interface, to create a system-level distribution method for these files. We detail our proposed methodology, related work, and our current progress.
Published in: 2012 41st International Conference on Parallel Processing Workshops (ICPPW)
Date of Conference: 10-13 Sept. 2012