This paper proposes a method for matching two sets of images given a small number of training examples by exploiting the underlying structure of the image manifolds. A nonlinear map from one manifold to another is constructed by combining linear maps locally defined on the tangent spaces of the manifolds. This construction imposes strong constraints on the choice of the maps and makes it possible to generalize the correspondences well across the full image sets. The map is flexible enough to approximate an arbitrary diffeomorphism between manifolds and can serve a variety of applications. The underlying algorithm is an efficient non-iterative procedure whose complexity depends mainly on the number of matched training examples and the dimensionality of the manifold, and not on the number of samples or the dimensionality of the images. Several experiments were performed to demonstrate the potential of our method in image analysis and pose estimation. First, images from a rotating camera are mapped to the underlying pose manifold. Second, computer-generated images of articulating toy figures are matched using the underlying four-dimensional manifold to generate image-driven animations. Finally, two sets of real lip images recorded during speech are matched via their appearance manifolds. In all these cases, our algorithm obtains reasonable matches between thousands of high-dimensional images with minimal computation.
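The idea of building a nonlinear map by combining locally defined linear maps can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details are not given in the abstract. It assumes that low-dimensional coordinates `X` (source manifold) and `Y` (target manifold) are already available for the matched training pairs, fits one linear map per pair by kernel-weighted least squares, and blends the resulting local affine predictions with normalized Gaussian weights. The function names, the Gaussian kernel, and the `bandwidth` parameter are all assumptions made for illustration.

```python
import numpy as np

def fit_local_linear_maps(X, Y, bandwidth):
    """Fit one linear map A_i per matched pair (hypothetical helper).

    X: (n, d) source coordinates, Y: (n, m) target coordinates.
    Returns an array of shape (n, m, d) with Y_j - Y_i ≈ A_i (X_j - X_i)
    for pairs j near i.
    """
    n = X.shape[0]
    maps = []
    for i in range(n):
        dX = X - X[i]  # displacements around pair i in the source
        dY = Y - Y[i]  # matching displacements in the target
        # Gaussian weights emphasize nearby pairs (an assumed choice).
        w = np.exp(-np.sum(dX**2, axis=1) / (2.0 * bandwidth**2))
        sw = np.sqrt(w)[:, None]
        # Weighted least squares: min_B sum_j w_j ||dX_j B - dY_j||^2
        B, *_ = np.linalg.lstsq(sw * dX, sw * dY, rcond=None)
        maps.append(B.T)
    return np.stack(maps)

def blended_map(x, X, Y, maps, bandwidth):
    """Map a source point x by blending the local affine predictions."""
    w = np.exp(-np.sum((X - x)**2, axis=1) / (2.0 * bandwidth**2))
    w /= w.sum()
    # Prediction of each local model: Y_i + A_i (x - X_i)
    local = Y + np.einsum("nij,nj->ni", maps, x - X)
    return w @ local
```

In this sketch the cost of fitting grows with the number of matched pairs and the manifold dimension only, which mirrors the complexity claim above; the image dimensionality never enters because all computation happens in manifold coordinates.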