Multi-core platform for an efficient H.264 and VC-1 video decoding based on macroblock row-level parallelism

3 Author(s)
J.-Y. Lee (Electronics and Telecommunications Research Institute (ETRI), Daejeon, Korea); J.-J. Lee; S. M. Park

For video decoding such as H.264 and VC-1 to be effective in multi-core environments, several kinds of parallelism must be utilised. Here, a novel parallelisation methodology for video decoding, macroblock row-level parallelism (MBRLP), is presented. The ETRI multimedia processing core (EMC) and the ETRI multi-core platform (EMP) are proposed for adopting MBRLP. In terms of the scalability and utilisation of processing cores, MBRLP has advantages over other parallelisation strategies such as frame-, slice- and macroblock (MB)-level parallelism. Scalability can be achieved simply by increasing the number of processing cores and applying homogeneous software design/optimisation techniques to each EMC. Instead of employing a dynamic MB-level scheduler, a hybrid approach is used: a two-stage functional pipeline combined with MBRLP. This hybrid of MBRLP and de-blocking pipelining can relieve the synchronisation and inter-processor communication overheads incurred by multi-core decoding systems, as well as the overheads of a run-time scheduler. As a result, the proposed parallelisation method and architectures can boost performance with an efficiency of 83%. The proposed architecture, consisting of six EMC clusters, can perform real-time D1 (720 × 480) decoding at 30 fps at around 200 MHz. The same concept can be applied to full-HD (1920 × 1088) video decoding in this work. It is found that as the number of processing cores increases, performance improves almost linearly. The EMP, consisting of four EMC clusters (eight cores), memories and other peripherals, is prototyped on a Xilinx Virtex4 XC4VL200 FPGA operating at 60 MHz.
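The idea of assigning whole macroblock rows to cores can be illustrated with a small timing model. The sketch below is not the authors' implementation; it rests on several assumptions that the abstract does not state: every MB costs one time unit, rows are assigned to cores statically in round-robin order, and an MB may start only after the upper-right MB of the previous row has finished (the usual neighbour dependency in H.264 intra and motion-vector prediction). The function names `mbrlp_schedule` and `speedup` are hypothetical, and the two-stage de-blocking pipeline is not modelled.

```python
def mbrlp_schedule(mb_cols, mb_rows, num_cores):
    """Simulate row-level parallel decoding with unit cost per macroblock.

    Returns finish[r][c], the time step at which MB (r, c) completes.
    Assumptions (not from the paper): static round-robin row assignment
    and an upper-right neighbour dependency between adjacent rows.
    """
    finish = [[0] * mb_cols for _ in range(mb_rows)]
    core_free = [0] * num_cores  # time at which each core becomes idle

    for r in range(mb_rows):
        core = r % num_cores          # row r always runs on the same core
        t = core_free[core]           # cannot start before the core is idle
        for c in range(mb_cols):
            if r > 0:
                # Wait for the upper-right MB; for the last column,
                # fall back to the last MB of the row above.
                dep = finish[r - 1][min(c + 1, mb_cols - 1)]
                t = max(t, dep)
            t += 1                    # decode this MB (one time unit)
            finish[r][c] = t
        core_free[core] = t
    return finish


def speedup(mb_cols, mb_rows, num_cores):
    """Ratio of sequential decode time to the simulated parallel makespan."""
    makespan = mbrlp_schedule(mb_cols, mb_rows, num_cores)[-1][-1]
    return (mb_cols * mb_rows) / makespan


if __name__ == "__main__":
    # D1 (720 x 480) is 45 x 30 macroblocks of 16 x 16 pixels.
    for cores in (1, 2, 4, 6):
        print(cores, round(speedup(45, 30, cores), 2))
```

In this model each row starts two MB slots after the row above it, so the cores run in a staggered wavefront. The only synchronisation point is one progress check against the neighbouring row, which is the kind of low-overhead coordination the abstract contrasts with a dynamic MB-level scheduler.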

Published in:

IET Circuits, Devices & Systems (Volume: 4, Issue: 2)