We analyze a ring algorithm for the computation of long-range interactions, and a modified version that uses a box-assisted approach with linked lists for the computation of short-range interactions. The general problem is exemplified by computing the distances between the points of a set and building their histogram. The algorithms, originally developed for homogeneous parallel systems, where they yield a nearly linear speed-up, are ported to heterogeneous systems (e.g. networks of workstations, NOW). The main part of our work analyzes the performance obtainable on such systems using a virtual ring of processes and assigning to each node a number of processes proportional to its relative speed. Following our analysis, we implemented a computer simulator that makes it possible to investigate some interesting properties of ring algorithms and to predict the experimental results with good accuracy. Simulations and trials show that using multiple processes per node greatly reduces load imbalance, which is the major cause of performance loss, without significant context-switching overhead, allowing good performance even on highly heterogeneous systems. The short-range interaction problem is interesting because, by varying the neighbourhood size, we are able to vary the computation-to-communication ratio of the algorithms. The proposed analysis is general and applies to any regular data-parallel ring-based application.
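As an illustration only, the ring scheme for the distance histogram can be sketched in a few lines of sequential Python that mimics a virtual ring of processes. This is not the authors' implementation: the function name `ring_distance_histogram`, the round-robin split of the points, the `bin_width` parameter, and the half-ring shortcut that exploits the symmetry of distances are all assumptions made for this sketch. A real version would run one process per ring position and exchange blocks by message passing.

```python
import math
from collections import Counter


def ring_distance_histogram(points, n_procs, bin_width):
    """Histogram of all pairwise distances, organised as a virtual ring.

    Hypothetical sketch: each of the n_procs virtual processes owns one
    block of points; a copy of every block travels around the ring.
    """
    # Round-robin split (an assumption; any balanced split would do).
    blocks = [points[r::n_procs] for r in range(n_procs)]
    hist = Counter()

    def add(a, b):
        hist[int(math.dist(a, b) // bin_width)] += 1

    # Step 0: every process handles the pairs inside its own block.
    for block in blocks:
        for i in range(len(block)):
            for j in range(i + 1, len(block)):
                add(block[i], block[j])

    # Steps 1 .. n_procs // 2: at step s, process `rank` holds the block
    # that started on process (rank - s) mod n_procs. Because distances
    # are symmetric, half a turn of the ring covers every block pair.
    for step in range(1, n_procs // 2 + 1):
        last_half_step = (n_procs % 2 == 0) and (step == n_procs // 2)
        for rank in range(n_procs):
            # On the final step of an even ring each block pair meets
            # twice, so only the first half of the ring computes.
            if last_half_step and rank >= n_procs // 2:
                continue
            visiting = blocks[(rank - step) % n_procs]
            for a in blocks[rank]:
                for b in visiting:
                    add(a, b)
    return hist
```

In this sketch every virtual process does roughly the same amount of work per step, so on a heterogeneous system the slowest node would set the pace; giving faster nodes more virtual processes, as the abstract describes, is what rebalances the load.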
Date of Conference: 5-7 Feb. 2003