Exploiting Parallelism of MPEG-4 Decoder with Dataflow Programming on Multicore Processor

5 Author(s)
Zhong-Ho Chen ; Dept. of CSIE, Nat. Cheng-Kung Univ., Tainan, Taiwan ; Ta-Chun Chen ; Jung-Yin Chien ; Alvin Su

Multicore processors provide large computational capability but also involve complicated parallel programming. One of the major considerations in parallel programming is performance. Traditional design methodologies, which usually start a design on a selected platform, spend a lot of effort and time on performance tuning and debugging. When the platform changes, even just to a different number of cores, considerable redesign effort is required. Hence a flexible design methodology is necessary. In this paper, a design methodology for video codecs on multicore processors is presented, using an MPEG-4 SP decoder as an example. The parallelism of the MPEG-4 decoder is discussed and exposed with the dataflow model. The dataflow model provides a high-level abstraction of the underlying hardware. Computation and communication in the MPEG-4 decoder are separated and represented as modules and channels, respectively. The model can be synthesized to target either dedicated hardware or software on a multiprocessor. To map the high-level dataflow model to the Cell processor, a mapping flow is developed that includes offline profiling, task allocation, and runtime libraries. Based on the profiling results, the allocation algorithm distributes tasks across the processors as evenly as possible. An efficient synchronization mechanism on the Cell processor is also proposed. We also discuss the impact of the model and the mapping flow on decoding speed. The results show that the proposed methodology achieves a considerable performance boost as the number of cores increases.
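
The abstract does not say which allocation algorithm the authors use, so the following is only a minimal sketch of the general idea of profile-driven load balancing. It assumes a greedy longest-processing-time-first heuristic, which is a standard technique and not necessarily the paper's method. The function name `allocate_tasks`, the module names, and the cost figures are all hypothetical and chosen for illustration.

```python
import heapq


def allocate_tasks(profiled_costs, num_cores):
    """Assign tasks to cores with a greedy longest-processing-time-first rule.

    profiled_costs: dict mapping a task name to its measured cost
                    (hypothetical offline-profiling output).
    num_cores:      number of cores to spread the tasks over.
    Returns (assignment, loads), both dicts keyed by core index.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")

    # Min-heap of (current load, core index): the least-loaded core pops first.
    heap = [(0, core) for core in range(num_cores)]
    heapq.heapify(heap)
    assignment = {core: [] for core in range(num_cores)}

    # Place the most expensive tasks first, each on the least-loaded core.
    ordered = sorted(profiled_costs.items(), key=lambda kv: kv[1], reverse=True)
    for task, cost in ordered:
        load, core = heapq.heappop(heap)
        assignment[core].append(task)
        heapq.heappush(heap, (load + cost, core))

    loads = {
        core: sum(profiled_costs[t] for t in tasks)
        for core, tasks in assignment.items()
    }
    return assignment, loads


# Hypothetical decoder modules with made-up profiling costs.
example_costs = {
    "parser": 5,
    "inverse_dct": 4,
    "motion_compensation": 3,
    "dequantization": 3,
    "frame_merge": 1,
}
example_assignment, example_loads = allocate_tasks(example_costs, 2)
```

Because the heuristic takes only the cost table and the core count as input, rerunning it with a different `num_cores` yields a new mapping without touching the modules themselves, which matches the flexibility the abstract argues for. It ignores channel communication cost, which a real mapping flow would likely need to account for.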

Published in:

International Symposium on Parallel and Distributed Processing with Applications

Date of Conference:

6-9 Sept. 2010