Fast Block Motion Estimation With 8-Bit Partial Sums Using SIMD Architectures

3 Author(s)
Chunjiang J. Duanmu (Concordia Univ., Montreal); M. Omair Ahmad; M. N. S. Swamy

To take advantage of the byte-level data parallelism of existing single-instruction multiple-data (SIMD) techniques, this paper introduces the concept of 8-bit partial sums, each obtained by a 4-bit right shift of the sum of the 16 luminance values in a column of a 16 × 16 block of a video frame. Since each partial sum occupies only eight bits, eight of them can be processed concurrently in a single 64-bit SIMD register. A method of employing these partial sums to speed up a given block motion-estimation algorithm is then proposed. The notion of 8-bit partial sums is extended to the four-level case. It is shown that there are 15 possible methods of using these multilevel 8-bit partial sums to accelerate a block motion-estimation algorithm without any loss of accuracy. Each of these 15 methods is applied to the full-search algorithm to determine the one with the lowest computational complexity, and that method is adopted as the chosen scheme for accelerating various block motion-estimation algorithms. Extensive simulations on eight video sequences show that substantial speed-up can be achieved when the chosen scheme is incorporated into the various motion-estimation algorithms. The simulation results also demonstrate that implementation on SIMD architectures can further accelerate the execution of the proposed scheme by more than 93%.
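The construction of the 8-bit partial sums described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`partial_sums_8bit`, `pack_eight`, `sad`, `sad_lower_bound`) are hypothetical, the packing only emulates a 64-bit SIMD register with an ordinary integer, and the lower bound shown is one conservative bound that follows from the triangle inequality and the shift, assumed here for illustration rather than taken from the paper.

```python
import random

BLOCK = 16  # block width and height in pixels


def partial_sums_8bit(block):
    """Return the 16 column sums of a 16x16 luminance block, each shifted right by 4 bits."""
    sums = []
    for col in range(BLOCK):
        column_sum = sum(block[row][col] for row in range(BLOCK))  # at most 16 * 255 = 4080
        sums.append(column_sum >> 4)  # at most 255, so the result fits in 8 bits
    return sums


def pack_eight(values):
    """Pack eight 8-bit values into one 64-bit integer, lowest lane first (emulates a SIMD register)."""
    word = 0
    for lane, value in enumerate(values):
        word |= (value & 0xFF) << (8 * lane)
    return word


def sad(block_a, block_b):
    """Sum of absolute differences between two 16x16 blocks."""
    return sum(
        abs(block_a[r][c] - block_b[r][c])
        for r in range(BLOCK)
        for c in range(BLOCK)
    )


def sad_lower_bound(ps_a, ps_b):
    """Conservative lower bound on the SAD computed from 8-bit partial sums.

    Assumption (not from the paper): for column sums a and b,
    |a - b| >= 16 * |(a >> 4) - (b >> 4)| - 15, and the SAD is at least
    the sum of |a - b| over all columns.
    """
    return sum(max(0, 16 * abs(a - b) - 15) for a, b in zip(ps_a, ps_b))


if __name__ == "__main__":
    random.seed(0)
    cur = [[random.randrange(256) for _ in range(BLOCK)] for _ in range(BLOCK)]
    ref = [[random.randrange(256) for _ in range(BLOCK)] for _ in range(BLOCK)]
    ps_cur, ps_ref = partial_sums_8bit(cur), partial_sums_8bit(ref)
    print("packed lanes 0-7 :", hex(pack_eight(ps_cur[:8])))
    print("lower bound      :", sad_lower_bound(ps_cur, ps_ref))
    print("exact SAD        :", sad(cur, ref))
```

Under these assumptions, a candidate block whose lower bound already exceeds the best SAD found so far could be skipped without computing its exact SAD, which is one way a partial-sum test can save work without changing the search result.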

Published in:

IEEE Transactions on Circuits and Systems for Video Technology (Volume: 17, Issue: 8)