We investigate applying non-linear manifold learning techniques to markerless human motion capture. We hypothesize that the set of segmented binary images of a person in all possible poses, captured in a constrained environment, lies on a low-dimensional manifold in image space. Because it is not feasible to sample this manifold densely by capturing real-life images, we propose to learn it from synthetic images. An accurate 3D mesh of the actor can be used to generate the synthetic 3D virtual data. A set of poses (each a collection of hierarchical joint angles defining the stance of a person at a point in time) spanning the space of possible human motion is used to animate the mesh, and the synthetic images are then captured by virtual cameras. We hypothesize that these vectorized synthetic images lie on a low-dimensional manifold shared by the pose vectors. We then align the synthetic image and pose pairs to form a common manifold by constraining them to be equivalent. Given a new set of real images of the actor, the system projects each captured image onto the aligned common manifold and determines the closest synthetic poses, which are linearly combined to generate the output pose. Our experiments show promising results for this method.
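The last step, finding the closest synthetic poses on the common manifold and combining them linearly, can be illustrated with a short sketch. This is not the authors' implementation. It assumes the real image has already been projected to a point `query` in the aligned embedding, and that `embedded` and `poses` hold the synthetic samples row by row. The function name `blend_nearest_poses`, the parameters `k` and `reg`, and the choice of locally linear reconstruction weights (constrained to sum to one) are all assumptions made for illustration.

```python
import numpy as np

def blend_nearest_poses(query, embedded, poses, k=3, reg=1e-3):
    """Estimate a pose as a linear blend of the k nearest synthetic poses.

    query:    (d,) embedded coordinates of the real image (assumed given)
    embedded: (n, d) embedded coordinates of the synthetic samples
    poses:    (n, m) joint-angle vectors paired with the synthetic samples
    """
    # Euclidean distance from the query to every synthetic sample.
    dists = np.linalg.norm(embedded - query, axis=1)
    idx = np.argsort(dists)[:k]

    # Weights that best reconstruct the query from its neighbours,
    # constrained to sum to one (hypothetical choice, regularised for stability).
    diffs = embedded[idx] - query
    gram = diffs @ diffs.T
    gram = gram + reg * (np.trace(gram) + 1e-12) * np.eye(k)
    weights = np.linalg.solve(gram, np.ones(k))
    weights = weights / weights.sum()

    # Apply the same weights to the paired pose vectors.
    return weights @ poses[idx], idx, weights

# Toy data only: random embedding with poses that depend linearly on it.
rng = np.random.default_rng(0)
embedded = rng.normal(size=(200, 2))
poses = embedded @ rng.normal(size=(2, 5))
pose_estimate, neighbours, weights = blend_nearest_poses(
    np.array([0.1, -0.2]), embedded, poses
)
print(neighbours, weights, pose_estimate.shape)
```

Reusing the embedding-space weights on the pose vectors relies on the hypothesis stated above that images and poses share one manifold. Simpler schemes such as inverse-distance weighting would fit the same interface.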