Summary form only given. We propose a new data distribution scheme for a parallel Strassen matrix multiplication algorithm on heterogeneous clusters. In a heterogeneous cluster environment, appropriate data distribution is the most important factor in achieving maximum overall performance. Strassen's algorithm, however, reduces the total operation count to about 7/8 of its previous value with each level of recursion, so the recursion level also affects the total operation count. We therefore need to consider not only load balancing but also the recursion level in Strassen's algorithm. Our scheme achieves both load balancing and a reduction in the total operation count. As a result, we achieve a speedup of nearly 21.7% over the conventional parallel Strassen algorithm in a heterogeneous cluster environment.
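The 7/8 factor per recursion level can be illustrated with a small count of scalar multiplications. The sketch below is not the proposed distribution scheme. It assumes that each Strassen level replaces 8 half-size block products with 7, that the leaf blocks are multiplied conventionally, and that the extra block additions are ignored. The function names are hypothetical.

```python
def strassen_multiplications(n: int, levels: int) -> int:
    """Scalar multiplications for an n x n product when Strassen's recursion
    is applied `levels` times and leaf blocks use the conventional algorithm.

    Assumption: block additions are not counted, only multiplications.
    """
    if levels < 0 or n <= 0:
        raise ValueError("n must be positive and levels non-negative")
    if n % (2 ** levels) != 0:
        raise ValueError("n must be divisible by 2**levels")
    leaf = n // (2 ** levels)          # side length of each leaf block
    return (7 ** levels) * leaf ** 3   # 7 sub-products per level, leaf**3 each


def reduction_ratio(n: int, levels: int) -> float:
    """Multiplication count relative to the conventional n**3 algorithm."""
    return strassen_multiplications(n, levels) / n ** 3


if __name__ == "__main__":
    n = 1024
    for levels in range(4):
        # The ratio equals (7/8) ** levels under the assumptions above.
        print(levels, strassen_multiplications(n, levels),
              f"{reduction_ratio(n, levels):.4f}")
```

Because the ratio shrinks geometrically with the recursion level, a distribution that forces some nodes to stop recursing early gives up part of this saving, which is why the recursion level matters alongside load balancing.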