To alleviate the cost of data transfer among processor elements, many hardware interconnects allow data transfer to be performed concurrently with computation. The problem of assigning tasks to processors is well known to be NP-complete except in a few special cases. To improve the overall performance of high performance computing (HPC), this paper develops a scheme for an HPC code generator and presents a data partitioning algorithm for efficient data distribution. The algorithm generates efficient data partitions, from which optimized assignments can be selected to reduce communication among processor elements, and the complexity of the search is reduced from exponential to polynomial. The algorithm is tested and integrated in HPC tools running on CRAY-T3E, YMP, IBM Regatta, and SGI workstation.
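As a rough illustration of why the choice of data partition affects communication among processor elements, the sketch below compares two standard one-dimensional distributions (block and cyclic) under a nearest-neighbor access pattern. This is not the paper's algorithm, which the abstract does not describe in detail; the function names (`block_partition`, `cyclic_partition`, `stencil_comm_cost`) and the cost model (one unit per neighbor access that crosses a processor boundary) are assumptions made for this example only.

```python
def block_partition(n, p):
    """Split indices 0..n-1 into p contiguous blocks whose sizes differ by at most 1."""
    base, extra = divmod(n, p)
    blocks, start = [], 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def cyclic_partition(n, p):
    """Deal indices 0..n-1 round-robin: index i goes to processor i % p."""
    return [range(rank, n, p) for rank in range(p)]


def stencil_comm_cost(n, partition):
    """Count accesses to neighbors i-1 and i+1 that live on another processor.

    Hypothetical cost model: each remote neighbor access costs one unit.
    """
    owner = {}
    for rank, indices in enumerate(partition):
        for i in indices:
            owner[i] = rank
    cost = 0
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n and owner[j] != owner[i]:
                cost += 1
    return cost


if __name__ == "__main__":
    n, p = 10, 3
    print("block :", stencil_comm_cost(n, block_partition(n, p)))
    print("cyclic:", stencil_comm_cost(n, cyclic_partition(n, p)))
```

Under this toy model the block distribution only communicates at the two block boundaries, while the cyclic distribution makes every neighbor access remote. A partitioning algorithm of the kind the abstract describes would presumably search a much larger space of candidate partitions, which is where the reduction from exponential to polynomial complexity matters.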
Date of Conference: 20-22 Dec. 2008