A multi-functional dot product unit with SIMD architecture for embedded 3D graphics engine

4 Author(s)
Yisong Chang ; Sch. of Comput. Sci. & Technol., Tianjin Univ., Tianjin, China ; Jizeng Wei ; Wei Guo ; Jizhou Sun

The floating-point 4D vector dot product (DP4) unit is a key factor in the overall performance of an embedded graphics engine. In this paper, an enhanced multi-functional DP4 unit with an optimized single instruction multiple data (SIMD) architecture is proposed, in which the basic vector multiplication, addition and comparison operations of 3D graphics applications are combined with the dot product itself. To reduce area overhead, several fundamental hardware modules in the proposed DP4 unit are multiplexed and vectorized for the SIMD functions instead of adding discrete floating-point multipliers and adders. The proposed methods also avoid a significant increase in critical path delay, so the high performance of entire shader applications is maintained. Normalization and rounding logic for the SIMD vector addition function is also implemented; it runs in parallel with the fundamental blocks and does not affect the critical path. Synthesis results show that the proposed multi-functional DP4 unit provides high performance with no increase in critical path delay, at a cost of only a 17% increase in area. The design can also be fully pipelined, with a latency of 4 cycles and single-cycle throughput at a 193 MHz clock speed.
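
As a rough illustration of what "multi-functional" means here, the following is a minimal behavioral sketch in Python of a unit that serves the four operations named in the abstract (dot product, vector multiplication, vector addition, vector comparison) from one entry point. It is not the authors' design: the names `Op` and `dp4_unit`, the choice of lane-wise less-than for the comparison, and the pairwise summation order are all assumptions made for illustration. The sketch does not model single-precision rounding, pipelining, or the sharing of adder hardware.

```python
from enum import Enum


class Op(Enum):
    """Hypothetical operation selector; names are not from the paper."""
    DP4 = "dp4"  # 4D dot product, scalar result
    MUL = "mul"  # lane-wise vector multiplication
    ADD = "add"  # lane-wise vector addition
    CMP = "cmp"  # lane-wise comparison (assumed here to be a < b)


def dp4_unit(op, a, b):
    """Behavioral model of a multi-functional DP4 unit on two 4D vectors.

    The four products are computed once and used by both DP4 and MUL,
    loosely mirroring the idea of reusing the same multiplier lanes
    rather than adding discrete multipliers.
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("both operands must be 4D vectors")

    if op in (Op.DP4, Op.MUL):
        products = [x * y for x, y in zip(a, b)]  # shared multiplier lanes
        if op is Op.MUL:
            return products
        # Assumed two-level summation of the four products.
        return (products[0] + products[1]) + (products[2] + products[3])

    if op is Op.ADD:
        return [x + y for x, y in zip(a, b)]

    if op is Op.CMP:
        return [x < y for x, y in zip(a, b)]

    raise ValueError(f"unsupported operation: {op}")


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0]
    b = [5.0, 6.0, 7.0, 8.0]
    for op in Op:
        print(op.name, dp4_unit(op, a, b))
```

In this model the only difference between `DP4` and `MUL` is whether the four products are summed or returned per lane, which is the kind of reuse that lets a single datapath cover several shader instructions.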

Published in:

Circuits and Systems (MWSCAS), 2011 IEEE 54th International Midwest Symposium on

Date of Conference:

7-10 Aug. 2011