Skip to Main Content
This correspondence studies the time complexity of the parallel computation of the product C = A.B of two dense square matrices A, B of order n, on a class of rectangular orthogonally connected systolic arrays, which are the two-dimensional extensions of the classical pipeline scheme. Such arrays are composed of multiply-add cells without local memory, and, as C is computed, the coefficients cij move vertically, whereas aik and bkj move horizontally in opposite directions. We first introduce a combinatorial formulation of the problem. Then we show that, if the cycle-time of a multiply-add cell is taken as time unit, and if T(p, m) denotes the running time of an optimal algorithm associated with an array of size p x m, then Minpm T(p,m) = 3n -2, and the minimum value of p.m for which this bound is tight is n.n [resp. n(n + 1)] if n is odd (resp. even). When compared to the algorithms previously proposed for the class of arrays based on cells without local memory, the solutions exhibited here appear to be the best, because they are the only ones which run in time T < = 3n -2 on a network of size S < = n(n + 1).