Reconfigurable computing systems that combine FPGAs with general-purpose processors have been built to achieve high performance. The nodes in such systems can differ in compute capacity, depending on the processors and FPGAs they contain. In this paper, we study algorithm design for heterogeneous reconfigurable systems, focusing on two key linear algebra kernels: matrix multiplication and LU decomposition. Our designs exploit the heterogeneity of the nodes and utilize up to 80% of the compute capacity of the system.
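
As a purely illustrative sketch of how node heterogeneity might be exploited, the following Python fragment divides the rows of a matrix among nodes in proportion to their compute capacities. This is an assumed, generic load-balancing scheme, not the design presented in the paper; the function name `partition_rows` and the capacity values are hypothetical.

```python
def partition_rows(n_rows, capacities):
    """Split n_rows among nodes in proportion to their compute capacities.

    Hypothetical helper for illustration only; `capacities` holds one
    positive number per node (for example, sustained GFLOPS).
    """
    total = sum(capacities)
    # Ideal fractional share of rows for each node.
    ideal = [n_rows * c / total for c in capacities]
    # Start from the floor of each ideal share.
    shares = [int(x) for x in ideal]
    # Give the remaining rows to the nodes with the largest fractional parts.
    leftover = n_rows - sum(shares)
    by_remainder = sorted(
        range(len(capacities)),
        key=lambda i: ideal[i] - shares[i],
        reverse=True,
    )
    for i in by_remainder[:leftover]:
        shares[i] += 1
    return shares


# Example: three nodes, the third with half the capacity of the others.
rows_per_node = partition_rows(1000, [4.0, 4.0, 2.0])
```

Under this assumption, a node with twice the capacity receives roughly twice as many rows, so that all nodes would finish their share of a row-partitioned matrix multiplication at about the same time.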