By Topic

Efficient architecture and design of an embedded video coding engine

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

2 Author(s)
Chen, J. ; Flarion Technol., Bedminster, NJ, USA ; Liu, K.J.R.

By doubling the accuracy of motion compensation from integer-pel to half-pel, we can significantly improve the coding gain. Therefore, in this paper, we propose a novel COordinate Rotation Digital Computer (CORDIC) architecture for combined design of discrete cosine transform (DCT) and half-pel motion estimation. Unlike the conventional block matching approaches based on interpolated images, our CORDIC design can directly extract motion vectors at half-pel accuracy in the transform domain without interpolation. Compared to the conventional block matching methods with interpolation, our multiplier-free design achieves significant hardware saving and far less data flow. Our emphasis in this paper is on achieving efficient design of video coding engine by minimizing computational units along the data path. Furthermore, we implement the embedded design on a dedicated single chip to demonstrate its performance. The DCT-based nature of our design enables us to efficiently combine both DCT and motion estimation units, which are the two most important components of many multimedia standards consuming more than 80% of computing power for a video coder, into one single component. As a result, we can provide a single chip solution for video coding engine while many conventional designs may require multiple chips. In addition, all multiply-and-add (MAC) operations in plane rotations are replaced by CORDIC processors with simple shift-and-add, which is quite simple and compact to realize while being no slower than the bit serial multipliers widely proposed for VLSI array structures. Based on the test result, our chip can operate at 20 MHz with 0.8-μm CMOS technology. Overall, we provide a low-complexity, high throughput solution in this paper for MPEG-1, MPEG-2, and H.263 compatible video codec design

Published in:

Multimedia, IEEE Transactions on  (Volume:3 ,  Issue: 3 )