Extensive articulated human detection by voting Cluster Boosted Tree

3 Author(s)
Bo Yang (Inst. for Robot. & Intell. Syst., Univ. of Southern California, Los Angeles, CA, USA); Chang Huang; Nevatia, R.

Our goal is to detect people in highly articulated poses, such as bending and crouching. This diversity in human poses makes detection much more difficult than for pedestrian poses. "Divide-and-conquer" is a favorable strategy for detecting objects with large intra-class variations: it splits object instances into several sub-categories and trains a relatively simple classifier for each sub-category. We propose a novel sample split method that improves the learning results for articulated humans. We adopt the cluster boosted tree (CBT) structure to automatically decide when a split should be triggered. Unlike the simple k-means used in CBT for sample splitting, our approach aims to minimize the training loss after the split. Since this minimization is an NP-hard problem, we design a heuristic algorithm that finds the optimal sample division according to each single feature and then reconciles these divisions into a final division through a voting-like process. We call our training method the voting cluster boosted tree (VCBT). Furthermore, to avoid large background areas in training samples, we first cluster samples according to their width/height ratios and then train a VCBT for each subset. We conduct an experiment on 17 infrared surveillance video clips, report superior performance compared with previous human detection methods, and show how our approach benefits the learning results by reducing training loss.
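
The voting-like split described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the per-feature objective, so the sketch assumes a median threshold as a stand-in for the loss-minimizing per-feature division, and it assumes that each feature's division is aligned to the first feature's division before votes are counted. The function names `split_by_feature`, `align`, and `voting_split` are hypothetical.

```python
from statistics import median


def split_by_feature(values):
    """Divide samples into two groups using one feature.

    Assumption: thresholding at the median stands in for the
    per-feature division that minimizes training loss.
    """
    threshold = median(values)
    return [1 if v > threshold else 0 for v in values]


def align(reference, labels):
    """Flip a division if it agrees with the reference on fewer than half
    of the samples, since group names 0/1 are arbitrary per feature."""
    agree = sum(r == l for r, l in zip(reference, labels))
    if 2 * agree < len(labels):
        return [1 - l for l in labels]
    return labels


def voting_split(samples):
    """Combine per-feature divisions into one division by majority vote.

    samples: list of equal-length feature vectors.
    Returns a list with a 0/1 group label per sample.
    """
    n_features = len(samples[0])
    columns = [[s[j] for s in samples] for j in range(n_features)]
    divisions = [split_by_feature(col) for col in columns]
    # Assumption: the first feature's division serves as the reference.
    reference = divisions[0]
    divisions = [align(reference, d) for d in divisions]
    # Each feature casts one vote per sample for group 1.
    votes = [sum(d[i] for d in divisions) for i in range(len(samples))]
    return [1 if 2 * v > n_features else 0 for v in votes]


if __name__ == "__main__":
    toy = [[0.1, 5.0, 0.2], [0.2, 4.0, 0.1], [0.9, 1.0, 0.8], [0.8, 0.5, 0.1]]
    print(voting_split(toy))
```

In this toy input the third feature disagrees with the other two on some samples, and the majority vote overrides it. A faithful implementation would replace the median threshold with a division chosen to reduce the boosting loss after the split.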

Published in:

2009 Workshop on Applications of Computer Vision (WACV)

Date of Conference:

7-8 Dec. 2009