Skip to Main Content
This paper describes a depth image based real-time skeleton fitting algorithm for the hand, using an object recognition by parts approach, and the use of this hand modeler in an American Sign Language (ASL) digit recognition application. In particular, we created a realistic 3D hand model that represents the hand with 21 different parts. Random decision forests (RDF) are trained on synthetic depth images generated by animating the hand model, which are then used to perform per pixel classification and assign each pixel to a hand part. The classification results are fed into a local mode finding algorithm to estimate the joint locations for the hand skeleton. The system can process depth images retrieved from Kinect in real-time at 30 fps. As an application of the system, we also describe a support vector machine (SVM) based recognition module for the ten digits of ASL based on our method, which attains a recognition rate of 99.9% on live depth images in real-time1.
Date of Conference: 6-13 Nov. 2011