Discriminative Latent Models for Recognizing Contextual Group Activities

5 Author(s): Tian Lan (Sch. of Comput. Sci., Simon Fraser Univ., Burnaby, BC, Canada); Yang Wang; Weilong Yang; S. N. Robinovitch; et al.

In this paper, we go beyond recognizing the actions of individuals and focus on group activities. This is motivated by the observation that human actions are rarely performed in isolation; the contextual information of what other people in the scene are doing provides a useful cue for understanding high-level activities. We propose a novel framework for recognizing group activities that jointly captures the group activity, the individual person actions, and the interactions among them. Two types of contextual information, group-person interaction and person-person interaction, are explored in a latent variable framework. In particular, we propose three different approaches to model person-person interaction. The first approach explores the structure of person-person interaction. Unlike most previous latent structured models, which assume a predefined structure for the hidden layer (e.g., a tree structure), we treat the structure of the hidden layer as a latent variable and infer it implicitly during learning and inference. The second approach explores person-person interaction at the feature level. We introduce a new feature representation called the action context (AC) descriptor, which encodes not only the action of an individual person in the video but also the behavior of other people nearby. The third approach combines the two. Our experimental results demonstrate the benefit of using contextual information for disambiguating group activities.
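As a rough illustration of the action context idea, the following is a minimal Python sketch, assuming per-person action classifier scores and 2-D positions are already available. The function name `action_context_descriptor`, the single distance threshold `radius`, and the max-pooling over all nearby people are illustrative simplifications, not the authors' exact construction of the AC descriptor.

```python
import numpy as np

def action_context_descriptor(scores, positions, focal_idx, radius=1.0):
    """Hypothetical sketch of an action-context style descriptor.

    scores:    (N, K) array of per-person action classifier scores
               (N people, K action classes).
    positions: (N, 2) array of person locations in the frame.
    focal_idx: index of the person the descriptor is built for.
    radius:    distance threshold defining "nearby" people (assumed).

    Returns a length-2K vector: the focal person's own action scores
    concatenated with context scores max-pooled over nearby people.
    """
    focal = scores[focal_idx]                      # the person's own action evidence
    dists = np.linalg.norm(positions - positions[focal_idx], axis=1)
    nearby = (dists <= radius) & (np.arange(len(scores)) != focal_idx)

    if nearby.any():
        context = scores[nearby].max(axis=0)       # strongest nearby evidence per class
    else:
        context = np.zeros_like(focal)             # no neighbours: empty context

    return np.concatenate([focal, context])

# Toy usage: 3 people, 2 action classes, person 0 as the focal person.
scores = np.array([[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]])
positions = np.array([[0.0, 0.0], [0.5, 0.0], [5.0, 5.0]])
print(action_context_descriptor(scores, positions, focal_idx=0))
# -> [0.9 0.1 0.2 0.8]: own scores plus max-pooled scores of the one nearby person.
```

The key point the sketch captures is that the representation of one person is augmented with evidence about what the people around them are doing, so that the classifier can exploit person-person context rather than each individual's appearance alone.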

Published in:

IEEE Transactions on Pattern Analysis and Machine Intelligence (Volume: 34, Issue: 8)