Video algorithms (e.g., H.264, MPEG-2/4) require tremendous computation power and data bandwidth. The complexity depends on whether the codec is encoding or decoding, and on the video standard, resolution, frame rate, and visual quality constraints. Many video architectures therefore use multiple processing elements (e.g., multiple DSPs or MCUs, a DSP or MCU with dedicated accelerators, or an FPGA) to meet the computation requirements of video algorithms. These architectures pose new challenges for video software, which is typically designed to run on a single processor. This paper presents a software design for a video architecture that uses parallel processing elements. It explains the following aspects in detail: (a) software partitioning, (b) algorithm-specific optimizations, (c) processor-specific optimizations, (d) efficient DMA and cache usage, and (e) concurrent scheduling of all parallel processing elements. The approach is illustrated with an MPEG-4 encoder on the TMS320DM6446, a device in the DaVinci™ family from Texas Instruments Ltd. The software architecture scales to various video standards (e.g., H.264, MPEG-2/4) as well as to various parallel-processing hardware solutions. The software achieves D1 at 30 fps on the given device at less than 50% DSP load.
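As a rough illustration of aspect (e), concurrent scheduling can be pictured as a macroblock-level pipeline in which each processing element works on a different macroblock during the same time step. The sketch below is only an assumption about how such a schedule might look: the three stage names, the one-macroblock granularity, and the helper functions `mb_for_stage`, `pipeline_steps`, and `print_schedule` are hypothetical and are not taken from the paper.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical pipeline stages, one per processing element.
 * The actual partitioning used in the paper is not given in the abstract. */
enum { STAGE_DMA_IN, STAGE_ACCEL, STAGE_DSP, NUM_STAGES };

/* Index of the macroblock that `stage` works on at time `step`,
 * or -1 if that stage is idle (pipeline still filling or already draining). */
int mb_for_stage(int step, int stage, int num_mbs)
{
    assert(step >= 0 && stage >= 0 && num_mbs >= 0);
    int mb = step - stage; /* stage s lags stage 0 by s steps */
    return (mb >= 0 && mb < num_mbs) ? mb : -1;
}

/* Total time steps needed to push num_mbs macroblocks through the pipeline. */
int pipeline_steps(int num_mbs, int num_stages)
{
    assert(num_stages > 0);
    return num_mbs > 0 ? num_mbs + num_stages - 1 : 0;
}

/* Print which macroblock each stage handles at every step ('-' means idle). */
void print_schedule(int num_mbs)
{
    int steps = pipeline_steps(num_mbs, NUM_STAGES);
    for (int step = 0; step < steps; ++step) {
        printf("step %d:", step);
        for (int stage = 0; stage < NUM_STAGES; ++stage) {
            int mb = mb_for_stage(step, stage, num_mbs);
            if (mb >= 0)
                printf(" %d", mb);
            else
                printf(" -");
        }
        printf("\n");
    }
}
```

Calling `print_schedule(4)` from a `main` function lists six steps. In the steady state all three stages are busy at once, which is the overlap that a concurrent schedule aims for; the first and last steps show the fill and drain overhead of the pipeline.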