Preliminary results of a concise investigation of the performance benefits obtained by exploiting thread and data parallelism in fast motion estimation (ME) algorithms for MPEG-2 are presented. Thirteen such fast ME algorithms were implemented using both thread-parallel and data-parallel schemes to determine their computational requirements in an embedded environment. The results are compared with both the default (non-parallelised, full-search) implementation and the respective non-parallelised version of each fast algorithm. The results demonstrate that both thread-level and data-level parallelism should be exploited where full-search motion estimation is a requirement. By contrast, for all of the fast methods, wide data-parallel hardware provides little performance improvement over a conservative, 4-byte single-instruction, multiple-data (SIMD) sum-of-absolute-differences (SAD) coprocessor. In the context of portable consumer applications, both sets of results strongly suggest a multi-core approach with moderate data-parallel infrastructure.
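For readers unfamiliar with the SAD kernel referred to above, the following is a minimal illustrative sketch of a 4-byte SAD operation and its use on a pixel block. It is not the implementation evaluated in this work. The function names (`pack4`, `sad4`, `block_sad`) and the little-endian lane order are assumptions made for illustration, and plain Python stands in for what a coprocessor would compute in a single operation.

```python
def pack4(pixels):
    """Pack four 8-bit pixel values into one 32-bit word.

    Lane order (first pixel in the lowest byte) is an assumption.
    """
    word = 0
    for i, p in enumerate(pixels):
        word |= (p & 0xFF) << (8 * i)
    return word


def sad4(a, b):
    """Sum of absolute differences over the four byte lanes of two words.

    Models one 4-byte SIMD SAD operation; hypothetical helper.
    """
    total = 0
    for shift in (0, 8, 16, 24):
        lane_a = (a >> shift) & 0xFF
        lane_b = (b >> shift) & 0xFF
        total += abs(lane_a - lane_b)
    return total


def block_sad(cur_rows, ref_rows):
    """SAD of two equally sized blocks, given as lists of pixel rows.

    Row width is assumed to be a multiple of 4, so each row is
    processed as a sequence of 4-byte SAD operations.
    """
    total = 0
    for cur, ref in zip(cur_rows, ref_rows):
        for x in range(0, len(cur), 4):
            total += sad4(pack4(cur[x:x + 4]), pack4(ref[x:x + 4]))
    return total
```

In a block-matching search, `block_sad` would be evaluated once per candidate displacement. A full search evaluates every candidate in the search window, whereas a fast method evaluates only a small subset, which may help explain why wider SAD hardware offers less additional benefit for the fast methods.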