This paper presents novel architectures for the efficient implementation of matrix multiplication on an FPGA-based parameterizable system. Matrix multiplication is important in many signal and image processing applications, including image and speech compression, filtering, coding and beamforming. Two architectures are presented, one following a systolic design methodology and the other a distributed arithmetic design methodology. The first approach uses the Baugh-Wooley algorithm in a systolic architecture implementation. The second approach is based on a distributed arithmetic ROM and accumulator structure. Implementations of both algorithms on a Xilinx FPGA board are described. The distributed arithmetic approach exhibits better performance than the systolic architecture approach.
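To make the ROM and accumulator idea concrete, the following is a minimal software sketch of distributed arithmetic, not the paper's hardware design. It assumes one operand (the rows of `a`) is fixed so its partial sums can be precomputed, and that the other operand holds signed two's-complement integers of `nbits` bits. The names `build_da_rom`, `da_inner_product`, `da_matmul` and the default `nbits=8` are hypothetical and chosen only for illustration.

```python
def build_da_rom(coeffs):
    """Precompute every partial sum of the fixed coefficients (the DA ROM).

    Entry `addr` holds the sum of coeffs[i] for each bit i set in `addr`.
    """
    size = 1 << len(coeffs)
    return [
        sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
        for addr in range(size)
    ]


def da_inner_product(coeffs, xs, nbits=8):
    """Bit-serial inner product using a ROM lookup and a shift-accumulate.

    Assumption: every x in `xs` fits in `nbits` bits of two's complement.
    """
    if len(coeffs) != len(xs):
        raise ValueError("coeffs and xs must have the same length")
    lo, hi = -(1 << (nbits - 1)), (1 << (nbits - 1)) - 1
    if any(not lo <= x <= hi for x in xs):
        raise ValueError("input does not fit in nbits two's complement")

    rom = build_da_rom(coeffs)
    mask = (1 << nbits) - 1
    acc = 0
    for b in range(nbits):
        # Gather bit b of every input into one ROM address.
        addr = 0
        for i, x in enumerate(xs):
            addr |= (((x & mask) >> b) & 1) << i
        term = rom[addr] << b
        # The most significant bit carries negative weight in two's complement.
        acc += -term if b == nbits - 1 else term
    return acc


def da_matmul(a, b, nbits=8):
    """Matrix product C = A * B, one DA inner product per output element.

    Rows of `a` play the role of fixed coefficients; columns of `b` are the
    bit-serial inputs.
    """
    cols = list(zip(*b))
    return [[da_inner_product(row, col, nbits) for col in cols] for row in a]
```

For example, `da_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`. The sketch replaces each multiplication with one table lookup per input bit, which is the property that makes the technique attractive on lookup-table-based FPGAs.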
Published in:
Signal Processing Systems, 2002. (SIPS '02). IEEE Workshop on
Date of Conference: 16-18 Oct. 2002