Design and VLSI Implementation of a Concurrent Solver for N-Coupled Least-Squares Fitting Problems

2 Author(s)
Jainandunsing, K. ; Delft Univ. of Tech., Delft, The Netherlands ; Deprettere, E.

Most algorithms for high-quality modeling and coding of stochastic sequences (speech or images) make extensive use of matrix operations. Because of the high computational complexity of these operations, conventional implementation techniques and architecture designs would almost certainly rule out such algorithms as candidates for real-time signal processing. In this paper, we present an algorithm, and its mapping onto a VLSI architecture, for the solution of N systems of linear equations, each of size (n+1) by (n+1), which arise from a speech coding algorithm. The systems form an ordered set, and consecutive systems differ from one another by rank-1 terms. This property is exploited to obtain the solutions of all systems concurrently. Through an analysis of the algebraic structure of the systems of equations, we succeed in reducing the complexity to that of a single matrix inversion, while enhancing the regularity of the algorithm, e.g., by including the back substitution in the main factorization loop. Next, we map the algorithm onto VLSI hardware, using a systematic hierarchical temporal/structural decomposition/partitioning approach. To achieve high throughput, we make extensive use of pipelining and show how a pipelined CORDIC processor element supports the desired operations. The complete equation solver is built around two pipelined CORDIC processor elements and two FIFO-type memories. The solver fits on three VLSI chips of size 6.5 × 6.5 mm² in a standard slow NMOS technology. The chips are of medium complexity, and the resulting floorplan is shown. The resulting architecture achieves a very high throughput with minimal dataflow-oriented hardware.
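The abstract names CORDIC processor elements as the basic arithmetic unit but gives no detail. As a rough illustration only, the following Python sketch shows the two textbook CORDIC modes in floating point: vectoring (rotate a vector onto the x-axis and report its length and angle) and rotation (rotate a vector by a given angle). The names `cordic_vectoring`, `cordic_rotate`, and `N_ITER`, and the choice of 24 iterations, are assumptions made for this sketch; they are not taken from the paper, and real hardware would use fixed-point shifts and adds rather than floating-point multiplies.

```python
import math

N_ITER = 24  # assumed number of micro-rotations; hardware word length would set this
ANGLES = [math.atan(2.0 ** -i) for i in range(N_ITER)]

# Each micro-rotation stretches the vector by sqrt(1 + 2^-2i); GAIN is the product.
GAIN = 1.0
for i in range(N_ITER):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_vectoring(x, y):
    """Drive y to zero by shift-and-add style micro-rotations.

    Returns (magnitude, angle) of the input vector. Assumes x > 0 so the
    vector lies inside the CORDIC convergence range (about +/- 99.9 degrees).
    """
    angle = 0.0
    for i in range(N_ITER):
        d = -1.0 if y > 0 else 1.0          # rotate toward the x-axis
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        angle -= d * ANGLES[i]              # accumulate the angle removed
    return x / GAIN, angle


def cordic_rotate(x, y, theta):
    """Rotate (x, y) counterclockwise by theta (|theta| below about 1.74 rad)."""
    z = theta
    for i in range(N_ITER):
        d = 1.0 if z >= 0 else -1.0         # drive the residual angle z to zero
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ANGLES[i]
    return x / GAIN, y / GAIN


if __name__ == "__main__":
    r, phi = cordic_vectoring(3.0, 4.0)
    print(r, phi)                           # magnitude and angle of (3, 4)
    print(cordic_rotate(r, 0.0, phi))       # rotating back recovers roughly (3, 4)
```

Vectoring followed by rotation is the pattern used in rotation-based matrix factorizations: one element pair determines the angle that zeroes an entry, and the same angle is then applied to the remaining element pairs of the two rows. Whether the paper's solver uses exactly this arrangement is not stated in the abstract.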

Published in:

Selected Areas in Communications, IEEE Journal on (Volume: 4, Issue: 1)