By Topic

Distribution-Based Dimensionality Reduction Applied to Articulated Motion Recognition

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Sunita Nayak ; Photometria Inc., San Diego ; Sudeep Sarkar ; Barbara Loeding

Some articulated motion representations rely on frame-wise abstractions of the statistical distribution of low-level features such as orientation, color, or relational distributions. As configuration among parts changes with articulated motion, the distribution changes, tracing a trajectory in the latent space of distributions, which we call the configuration space. These trajectories can then be used for recognition using standard techniques such as dynamic time warping. The core theory in this paper concerns embedding the frame-wise distributions, which can be looked upon as probability functions, into a low-dimensional space so that we can estimate various meaningful probabilistic distances such as the Chernoff, Bhattacharya, Matusita, Kullback-Leibler (KL) or symmetric-KL distances based on dot products between points in this space. Apart from computational advantages, this representation also affords speed-normalized matching of motion signatures. Speed normalized representations can be formed by interpolating the configuration trajectories along their arc lengths, without using any knowledge of the temporal scale variations between the sequences. We experiment with five different probabilistic distance measures and show the usefulness of the representation in three different contexts - sign recognition (with large number of possible classes), gesture recognition (with person variations), and classification of human-human interaction sequences (with segmentation problems). We find the importance of using the right distance measure for each situation. The low-dimensional embedding makes matching two to three times faster, while achieving recognition accuracies that are close to those obtained without using a low-dimensional embedding. We also empirically establish the robustness of the representation with respect to low-level parameters, embedding parameters, and temporal-scale parameters.

Published in:

IEEE Transactions on Pattern Analysis and Machine Intelligence  (Volume:31 ,  Issue: 5 )