Skip to Main Content
The unfolded and pipelined CORDIC is a high-performance hardware element that produces a wide variety of one and two argument functions with high throughput. The reduction in delay, power, and area (cost) are of significant interest regarding this module due to its high demand for resources. The linear approximation to rotation has been proposed to achieve such reductions. However, the schemes for rotation (multiplication) and vectoring (division) complicate the implementation in a single unit. In this work, we improve the linear approximation scheme, leading to a unified implementation for rotation and vectoring, where fully parallel tree multipliers are used instead of the second half of CORDIC iterations. We also combine the linear approximation to rotation with the scale factor compensation so that the compensation is concurrently performed with the rotation process. We then extend the method to 3D CORDIC. Such an extension is not straightforward due to the lack of existing analytical expressions for the convergence of the algorithm. A comparison, using a rough area-time model and synthesis results, shows that our proposals may achieve significant reductions in delay, with no increase in area, in actual implementations.