Applying CUDA Architecture to Accelerate Full Search Block Matching Algorithm for High Performance Motion Estimation in Video Encoding

5 Author(s)
Monteiro, E. (Inf. Inst., Fed. Univ. of Rio Grande do Sul (UFRGS), Porto Alegre, Brazil); Vizzotto, B.; Diniz, C.; Zatt, B.
This work presents a parallel GPU-based solution for the Motion Estimation (ME) process in a video encoding system. We propose a way to partition the steps of the Full Search block matching algorithm on the CUDA architecture. We also compare the performance of this solution against a theoretical model and two other implementations (a sequential one and a parallel one using the OpenMP library). We obtained an O(n^2/log^2 n) speed-up, which fits the proposed theoretical model across different search areas. This represents a gain of up to 600x over the serial implementation and up to 66x over the parallel OpenMP implementation.
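
For readers unfamiliar with Full Search block matching, the following is a minimal sequential sketch in Python of the general technique, not the authors' implementation. It assumes frames are lists of rows of integer luma samples and uses the sum of absolute differences (SAD) as the matching cost; the abstract does not name the cost metric, and the function names `sad` and `full_search` and all parameters are hypothetical.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for dy in range(n):
        for dx in range(n):
            total += abs(cur[by + dy][bx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Test every displacement within +/- search_range of (bx, by) and
    return (best_motion_vector, best_sad). Candidates that would fall
    outside the reference frame are skipped (an assumption; encoders
    often pad the frame instead)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for my in range(-search_range, search_range + 1):
        for mx in range(-search_range, search_range + 1):
            rx, ry = bx + mx, by + my
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            # Strict comparison keeps the first candidate found on ties.
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mx, my), cost
    return best_mv, best_cost
```

Each candidate's SAD is independent of every other candidate's, which is the property that makes the algorithm a natural fit for data-parallel hardware. How the paper maps this work onto CUDA threads and blocks is not described in the abstract, so no mapping is shown here.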

Published in:

2011 23rd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)

Date of Conference:

26-29 Oct. 2011