Fractional motion estimation (FME) is an important part of the H.264/AVC video encoding standard. The algorithm can significantly increase the compression ratio of video encoders while preserving high video quality. The full-search FME algorithm, however, is computationally expensive and can account for over 45% of the total motion estimation process. To maximise the performance and efficiency of FME implementations on field-programmable gate arrays (FPGAs), one needs to exploit the inherent parallelism in the algorithm efficiently. The authors investigate the scalability of the full-search FME algorithm on FPGAs and implement six scaled versions of the algorithm on Xilinx Virtex-5 FPGAs. They find that scaling the algorithm vertically within a 4 × 4 sub-block is more efficient than scaling horizontally across several sub-blocks. With four reference frames, the best vertically scaled design achieves 96 frames per second (fps) while encoding full 1920 × 1088 progressive HDTV video, and it consumes only 25.5 K LUTs and 28.7 K registers.
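To put the reported throughput in perspective, the following is a rough back-of-the-envelope sketch of the macroblock rate implied by the figures above. It assumes the standard 16 × 16 H.264 macroblock and, as a simplifying assumption not stated in the abstract, that every macroblock is refined once per reference frame. The function names are hypothetical and used only for illustration.

```python
MB_SIZE = 16  # H.264 macroblocks cover 16 x 16 luma samples


def macroblocks_per_frame(width, height, mb_size=MB_SIZE):
    """Number of macroblocks in a frame whose sides are multiples of mb_size."""
    if width % mb_size or height % mb_size:
        raise ValueError("frame dimensions must be multiples of the macroblock size")
    return (width // mb_size) * (height // mb_size)


def fme_workload(width, height, fps, ref_frames):
    """Rough workload estimate; assumes one refinement per macroblock per reference frame."""
    mbs = macroblocks_per_frame(width, height)
    mbs_per_second = mbs * fps
    return {
        "mbs_per_frame": mbs,
        "mbs_per_second": mbs_per_second,
        # Assumption: each reference frame is searched for every macroblock.
        "mb_refinements_per_second": mbs_per_second * ref_frames,
    }


# Figures taken from the abstract: 1920 x 1088 video, 96 fps, four reference frames.
workload = fme_workload(1920, 1088, fps=96, ref_frames=4)
print(workload)
```

Under these assumptions, a 1920 × 1088 frame holds 8160 macroblocks, so 96 fps corresponds to 783,360 macroblocks per second, or roughly 3.1 million macroblock refinements per second across four reference frames. The frame height is 1088 rather than 1080 because coded dimensions are padded to a multiple of 16.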