Large latencies over WANs will remain an obstacle to running communication-intensive parallel applications in grid environments. This paper takes one such application, Gaussian elimination of dense matrices, and describes a parallel algorithm that is highly tolerant of latency. The key technique is a pivoting strategy called batched pivoting, which requires far less frequent synchronization than other methods. Although it is a relaxed pivoting method, meaning it may select pivots other than the 'best' ones, we show that it achieves good numerical accuracy: in experiments with random matrices of sizes from 64 to 49,152, batched pivoting achieves numerical accuracy comparable to that of partial pivoting. We also evaluate the parallel execution speed of our implementation and show that it is much more tolerant of latency than partial pivoting.
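To make the synchronization cost concrete, the following is a minimal sequential sketch of the partial pivoting baseline, not the paper's implementation. It counts one pivot search per column; in a distributed run each such search would need a global agreement among processes. The helper `batched_sync_count` and its `batch` parameter are hypothetical: they only illustrate how the count would shrink if one synchronization covered several columns, and they do not implement batched pivoting itself.

```python
import math


def solve_partial_pivoting(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Returns (x, pivot_searches). Each pivot search would be a global
    synchronization point in a distributed implementation.
    """
    n = len(a)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    pivot_searches = 0
    for k in range(n):
        # Partial pivoting: choose the row with the largest |entry| in column k.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        pivot_searches += 1
        if m[p][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]
        # Eliminate column k from the rows below the pivot row.
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x, pivot_searches


def batched_sync_count(n, batch):
    """Hypothetical: synchronizations needed if one covers `batch` columns."""
    return math.ceil(n / batch)
```

Under these assumptions, an n-by-n system costs n pivot searches with partial pivoting, while an assumed batch of 64 columns would reduce the count for n = 49,152 to `batched_sync_count(49152, 64)`, which is 768.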