A programmable systolic array cell for signal processing applications is described. The cell uses two chips: the 16-bit NCR45CM16 CMOS multiplier/accumulator (MAC) for arithmetic, and the systolic array controller (SAC) for routing data and controlling the MAC. All major cell resources can operate concurrently. The practical details of implementing systolic array algorithms on an array of SAC/MAC cells are presented in full. A library of macros for commonly used program segments is described. Key issues discussed include programming the MAC, scaling operands, loading RAM, synchronizing cells, delaying data, unloading results, combining the macros into a program, and pipelining a program. Two systolic algorithms are developed: matrix multiplication on a linear array, and matrix multiplication on a two-dimensional array. With a two-dimensional array, a series of pipelined matrix-matrix multiplications uses the MAC every cycle.
Published in: IEEE Transactions on Acoustics, Speech and Signal Processing (Volume: 38, Issue: 7)
Date of Publication: Jul 1990
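
The abstract mentions matrix multiplication on a two-dimensional systolic array. As a rough illustration of that general idea, the following Python sketch simulates a generic output-stationary dataflow: rows of A enter from the left, columns of B enter from the top, each input stream is delayed by one cycle per row or column, and every cell performs one multiply-accumulate per cycle. This is an assumption about a typical arrangement, not the SAC/MAC program from the paper; the function name `systolic_matmul` and the register names `a_reg` and `b_reg` are hypothetical, and 16-bit operand scaling, RAM loading, and result unloading are not modeled.

```python
def systolic_matmul(A, B):
    """Simulate an n x m grid of multiply-accumulate cells computing A @ B.

    Assumed (not from the paper): output-stationary dataflow, where cell
    (i, j) keeps the running sum for C[i][j], A values move right and
    B values move down by one cell per cycle.
    """
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]    # one accumulator per cell
    a_reg = [[0] * m for _ in range(n)]  # A operand currently held by each cell
    b_reg = [[0] * m for _ in range(n)]  # B operand currently held by each cell
    cycles = n + m + k - 2               # fill + compute + drain for one product

    for t in range(cycles):
        new_a = [[0] * m for _ in range(n)]
        new_b = [[0] * m for _ in range(n)]
        for i in range(n):
            for j in range(m):
                if j == 0:
                    # Left edge: row i of A is delayed by i cycles (skewed input).
                    s = t - i
                    new_a[i][j] = A[i][s] if 0 <= s < k else 0
                else:
                    new_a[i][j] = a_reg[i][j - 1]  # take operand from left neighbor
                if i == 0:
                    # Top edge: column j of B is delayed by j cycles.
                    s = t - j
                    new_b[i][j] = B[s][j] if 0 <= s < k else 0
                else:
                    new_b[i][j] = b_reg[i - 1][j]  # take operand from upper neighbor
        a_reg, b_reg = new_a, new_b
        # Every cell performs one multiply-accumulate per cycle.
        for i in range(n):
            for j in range(m):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc, cycles


if __name__ == "__main__":
    C, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(C, cycles)
```

In this sketch a single product leaves some cells multiplying zeros while the skewed inputs fill and drain the array. The abstract's statement that the MAC is used every cycle concerns a series of pipelined matrix-matrix multiplications, which this sketch does not model.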