Optimization Strategies for High-Performance Computing of Optical-Flow in General-Purpose Processors

4 Author(s)
Anguita, M. (Dept. of Comput. Archit. & Technol., Univ. of Granada, Granada, Spain); Diaz, J.; Ros, E.; Fernandez-Baldomero, F.J.

In this paper, we describe the high-performance implementation of an optical-flow algorithm that takes advantage of the processor's architecture. Tuning the code, i.e., adapting it to take full advantage of the processor, is challenging and time consuming and requires efficient programming at different levels, but it can lead to significant improvements in performance. The optimized implementation presented here is of interest for a number of applications, since it delivers real-time motion estimation at high image resolution on a PC or in an embedded system based on a general-purpose processor. On a 2.83 GHz Core 2 Quad PC, it achieves a speedup of 14 compared to our first code version and 2052.7 f/s for the well-known 252 × 316 Yosemite sequence, and a speedup of 17.6 and 68.5 f/s for a 1016 × 1280 sequence. The description of how this performance is achieved goes beyond a specific application, however, since the paper illustrates how inherently dense, low-level visual algorithms (pixel-wise computation) can be structured and improved to take full advantage of a standard processor. The implementation is compared with other hardware (based on FPGAs and GPUs) and software (based on clusters, PCs, and special-purpose processors) optical-flow implementations, showing that it outperforms them.

Published in:

Circuits and Systems for Video Technology, IEEE Transactions on (Volume: 19, Issue: 10)