A robust, multi-frame, progressive refinement framework for registering narrow field of view video to reference imagery is presented. A major strength of the approach is its effectiveness in the presence of dissimilar video and reference image appearance. Normalized oriented energy image pyramids are employed to enable alignment of images with global visual dissimilarities, yet local feature commonality. Local matching is then applied coarse-to-fine, along four dimensions: spatial frequency, local support, search range, and model order (a robust parametric model fit is used to reject outliers at each iteration). Globally optimal multi-frame alignment is obtained with respect to several constraints: frame-to-reference local matches, recovered frame-to-frame motion, and optional a priori estimates of sensor pose. The framework is described in detail and applied to two examples: aerial video to geographic reference image alignment (georegistration) and retinal slit lamp video to fundus image alignment.
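The model-order progression with outlier rejection described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes point matches are given as pairs of 2-D coordinates, uses only two model orders (translation, then affine), and rejects outliers with a median-based residual threshold, which is a stand-in for whichever robust estimator the framework actually uses. All function names and the parameters `min_tol` and `k` are hypothetical.

```python
from statistics import median


def apply_model(model, p):
    """Map point p with affine parameters (a, b, c, d, e, f)."""
    a, b, c, d, e, f = model
    return (a * p[0] + b * p[1] + c, d * p[0] + e * p[1] + f)


def residual(model, match):
    """Euclidean distance between the mapped source and its matched target."""
    p, q = match
    x, y = apply_model(model, p)
    return ((x - q[0]) ** 2 + (y - q[1]) ** 2) ** 0.5


def fit_translation(matches):
    """Lowest model order: component-wise median displacement."""
    tx = median(q[0] - p[0] for p, q in matches)
    ty = median(q[1] - p[1] for p, q in matches)
    return (1.0, 0.0, tx, 0.0, 1.0, ty)


def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for i in range(3):
        piv = max(range(i, 3), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(3):
            if r != i:
                k = M[r][i] / M[i][i]
                M[r] = [u - k * v for u, v in zip(M[r], M[i])]
    return [M[i][3] / M[i][i] for i in range(3)]


def fit_affine(matches):
    """Higher model order: least-squares affine fit via normal equations.

    Needs at least three non-collinear source points.
    """
    A = [[0.0] * 3 for _ in range(3)]
    bx = [0.0] * 3
    by = [0.0] * 3
    for (x, y), (u, v) in matches:
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                A[i][j] += row[i] * row[j]
            bx[i] += row[i] * u
            by[i] += row[i] * v
    return tuple(solve3(A, bx) + solve3(A, by))


def robust_register(matches, min_tol=1.0, k=2.5):
    """Fit translation, then affine, rejecting outliers after each fit.

    The threshold is k times a median-based scale estimate of the
    residuals, floored at min_tol (an illustrative choice, not the
    paper's estimator).
    """
    inliers = list(matches)
    for fit in (fit_translation, fit_affine):
        model = fit(inliers)
        res = [residual(model, m) for m in matches]
        tol = max(min_tol, k * 1.4826 * median(res))
        inliers = [m for m, r in zip(matches, res) if r <= tol]
    return fit_affine(inliers), inliers
```

Calling `robust_register(matches)` returns the final affine parameters together with the matches retained as inliers. Starting from the low-order model keeps the first fit stable when many matches are wrong; the higher-order model is then estimated only from matches that survived the first rejection pass.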