In this paper, we present a processor mapping technique that reduces the amount of data exchanged in runtime data redistribution of symmetric matrices. The main idea of the proposed technique is to develop mathematical functions that map destination processors to a new sequence of processor IDs. The realigned order of destination processors is then used to perform data redistribution in the receiving phase. Combined with a local matrix transposition scheme, interprocessor communication can be eliminated entirely in runtime redistribution. A further advantage of this approach is that the complicated communication sets need not be computed, which greatly reduces the indexing cost. Theoretical analysis shows that (p-1)/p of the data transmission cost can be saved for a redistribution over a p×p processor grid. Experimental results also show that the processor mapping technique substantially improves runtime data redistribution.
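To illustrate why symmetry lets a local transposition stand in for communication, the following minimal sketch considers a deliberately simplified case: a row-block to column-block redistribution over a one-dimensional arrangement of p processors. This setting, the helper names, and the test matrix are assumptions made for illustration; they are not the mapping functions or the p×p grid analysis of the paper. In this assumed case, a generic exchange would have each processor receive (p-1)/p of its target block from other processors, whereas for a symmetric matrix the target block equals the transpose of the block already held locally.

```python
from fractions import Fraction

# Illustrative sketch only (assumed 1-D row-block -> column-block case).
# None of these helpers come from the paper; all names are hypothetical.

def make_symmetric(n):
    """Build an n x n symmetric matrix: entry (i, j) equals entry (j, i)."""
    return [[min(i, j) * n + max(i, j) for j in range(n)] for i in range(n)]

def row_block(a, rank, p):
    """Rows held by processor `rank` before redistribution (n divisible by p)."""
    b = len(a) // p
    return [row[:] for row in a[rank * b:(rank + 1) * b]]

def col_block(a, rank, p):
    """Columns that processor `rank` must hold after redistribution."""
    b = len(a) // p
    return [row[rank * b:(rank + 1) * b] for row in a]

def transpose(block):
    """Local transposition; uses only data already on the processor."""
    return [list(col) for col in zip(*block)]

def remote_fraction(n, p):
    """Share of a target column block that a generic exchange would have to
    receive from other processors in this assumed 1-D setting."""
    b = n // p
    remote = (n - b) * b      # entries whose source row lives on another processor
    total = n * b             # all entries of the target column block
    return Fraction(remote, total)

if __name__ == "__main__":
    n, p = 6, 3
    a = make_symmetric(n)
    for rank in range(p):
        # For symmetric input, the local transpose already is the target block.
        print(rank, transpose(row_block(a, rank, p)) == col_block(a, rank, p))
    print(remote_fraction(n, p))
```

The sketch assumes that n is divisible by p and that every processor can be treated independently. It does not model the realigned processor order used in the receiving phase, which the proposed technique requires for the general case.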