By Topic

A Highly Parallel and Scalable CABAC Decoder for Next Generation Video Coding

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

2 Author(s)
Sze, V. ; Syst. & Applic. R&D Center, Texas Instrum., Dallas, TX, USA ; Chandrakasan, A.P.

Future video decoders will need to support high resolutions such as Quad Full HD (QFHD, 4096 × 2160) and fast frame rates (e.g., 120 fps). Many of these decoders will also reside in portable devices. Parallel processing can be used to increase the throughput for higher performance (i.e., processing speed), which can be traded-off for lower power with voltage scaling. The next generation standard called High Efficiency Video Coding (HEVC), which is being developed as a successor to H.264/AVC, not only seeks to improve the coding efficiency but also to account for implementation complexity and leverage parallelism to meet future power and performance demands. This paper presents a silicon prototype for a pre-standard algorithm developed for HEVC (“H.265”) called Massively Parallel CABAC (MP-CABAC) that addresses a key video decoder bottleneck. A scalable test chip is implemented in 65-nm and achieves a throughput of 24.11 bins/cycle, which enables it to decode the max H.264/AVC bit-rate (300 Mb/s) with only a 18 MHz clock at 0.7 V, while consuming 12.3 pJ/bin. At 1.0 V, it decodes a peak of 3026 Mbins/s for a bit-rate of 2.3 Gb/s, enough for QFHD at 186 fps. Both architecture and joint algorithm-architecture optimizations used to reduce critical path delay, area cost and memory size are discussed.

Published in:

Solid-State Circuits, IEEE Journal of  (Volume:47 ,  Issue: 1 )