All-to-all personalized communication commonly occurs in many important parallel algorithms, such as FFT and matrix transpose. This paper presents new algorithms for all-to-all personalized communication or complete exchange in multidimensional torus- or mesh-connected multiprocessors. For an R×C torus or mesh where R⩽C, the proposed algorithms have time complexities of O(C) message startups and O(RC2) message transmissions. The algorithms for three- or higher-dimensional tori or meshes follow a similar structure. Unlike other existing message-combining algorithms in which the number of nodes in each dimension should be a power-of-two and square, the proposed algorithms accommodate non-power-of-two tori or meshes where the number of nodes in each dimension need not be power-of-two and square. In addition, destinations remain fixed over a larger number of steps in the proposed algorithms, thus making them amenable to optimizations. Finally, the data structures used are simple, hence making substantial savings of message-rearrangement time
Published in:
Parallel and Distributed Systems, IEEE Transactions on
(Volume:12
,
Issue:
1
)
Date of Publication: Jan 2001