We estimate and track articulated human poses in sequences from a single-view, real-time range sensor. We use a data-driven MCMC approach to find an optimal pose based on a likelihood that compares synthesized depth images to the observed depth image. To speed up convergence of this search, we use bottom-up detectors that generate candidate head, hand, and forearm locations. Our Markov chain dynamics explore solutions around these parts and thus combine bottom-up and top-down processing. The current performance is 10 frames per second. We provide a quantitative performance evaluation using hand-annotated data and demonstrate a significant improvement over a baseline ICP approach. The algorithm is then adapted to estimate the specific shape parameters of individual subjects for use in tracking. In particular, limb dimensions are included in the human pose parametrization and are automatically estimated for each subject in short training sequences. Tracking performance is quantitatively evaluated using these person-specific trained models.
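The core idea, a data-driven MCMC search whose likelihood compares a synthesized depth image against the observed one, and whose proposals occasionally jump to bottom-up part detections, can be illustrated with a minimal sketch. This is not the authors' implementation: the depth "renderer" below is a toy stand-in (a single Gaussian blob positioned by a 2-D pose vector), and all function names, parameters, and noise settings are assumptions chosen for illustration.

```python
import numpy as np

def synthesize_depth(pose, shape=(32, 32)):
    # Toy stand-in for a real depth renderer (hypothetical): place a
    # Gaussian "body part" blob at the (cx, cy) encoded in the pose vector.
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    cx, cy = pose
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / 50.0)

def log_likelihood(pose, observed, sigma=0.1):
    # Compare the synthesized depth image to the observed one,
    # assuming independent Gaussian pixel noise with std sigma.
    diff = synthesize_depth(pose) - observed
    return -np.sum(diff ** 2) / (2.0 * sigma ** 2)

def mcmc_pose_search(observed, detections, n_iters=2000, step=1.0,
                     jump_prob=0.2, seed=0):
    # Metropolis-Hastings over the pose vector.  Proposals mix small
    # random-walk moves with data-driven jumps toward bottom-up part
    # detections, which speeds convergence to high-likelihood poses.
    rng = np.random.default_rng(seed)
    pose = np.array([16.0, 16.0])          # start at the image center
    ll = log_likelihood(pose, observed)
    best_pose, best_ll = pose.copy(), ll
    for _ in range(n_iters):
        if detections and rng.random() < jump_prob:
            # Data-driven move: propose near a detected part location.
            idx = rng.integers(len(detections))
            proposal = np.asarray(detections[idx], dtype=float) \
                       + rng.normal(0.0, step, 2)
        else:
            # Local random-walk move around the current pose.
            proposal = pose + rng.normal(0.0, step, 2)
        ll_new = log_likelihood(proposal, observed)
        # Standard Metropolis accept/reject on the log-likelihood ratio.
        if np.log(rng.random()) < ll_new - ll:
            pose, ll = proposal, ll_new
            if ll > best_ll:
                best_pose, best_ll = pose.copy(), ll
    return best_pose
```

In this sketch the jump proposals play the role of the paper's head, hand, and forearm detectors: even a roughly accurate detection pulls the chain into the right basin of the likelihood, after which the random-walk moves refine the estimate.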