This paper presents a real-time 2D gesture tracking and recognition system based on the fusion of micro motion and vision sensors. The 100 Hz inertial data from MEMS sensors and the 5 Hz visual data from a CMOS camera (a small webcam) are fused using an extended Kalman filter (EKF). To synchronize the data from the two sensing components (i.e., the inertial sensor and the vision sensor) and to filter out high-frequency noise, the inertial data are passed through a moving average filter in the data preprocessing stage. Features of a checkerboard image are extracted using the Harris corner extraction algorithm, and the pose of the webcam relative to the world coordinate frame is then estimated using the direct linear transform (DLT). To avoid the gimbal lock problem, a rotation quaternion is used to represent the orientation of the overall system. For trajectory feature extraction, the obtained trajectories are first normalized to a 0.5×0.5 area through linear normalization, and the features are then extracted using a linear Discrete Cosine Transform (DCT). The extracted features are subsequently passed to a dynamic time warping (DTW) algorithm for recognition. Experimental results show that an accuracy of 92.3% can be achieved in recognizing handwritten numbers from 0 to 9, with ~100 ms required for recognition at a handwriting speed of ~137 mm/s.
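The recognition stage described above (linear normalization, DCT feature extraction, DTW matching) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the number of retained DCT coefficients (`n_coeffs=8`), the concatenation of x and y coefficients into one feature sequence, the absolute-difference DTW cost, and the nearest-template decision rule are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def normalize(points, size=0.5):
    """Linearly scale a 2D trajectory into a size x size box (0.5 x 0.5 here)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    # Guard against a zero extent (e.g. a perfectly vertical stroke).
    w = (max(xs) - x0) or 1.0
    h = (max(ys) - y0) or 1.0
    return [((x - x0) / w * size, (y - y0) / h * size) for x, y in points]

def dct2(signal, n_coeffs):
    """Unnormalized DCT-II, keeping only the first n_coeffs coefficients."""
    n = len(signal)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(signal))
        for k in range(n_coeffs)
    ]

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with an absolute-difference local cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]

def features(points, n_coeffs=8):
    """Assumed feature layout: DCT of x coordinates followed by DCT of y coordinates."""
    norm = normalize(points)
    return dct2([p[0] for p in norm], n_coeffs) + dct2([p[1] for p in norm], n_coeffs)

def classify(points, templates):
    """Return the label of the template with the smallest DTW distance (assumed rule)."""
    f = features(points)
    return min(templates, key=lambda label: dtw_distance(f, templates[label]))
```

In this sketch, `templates` is a hypothetical dictionary mapping each label (for example `"0"` to `"9"`) to a feature sequence produced by `features` from a reference trajectory. Because the trajectory is normalized before the DCT, a gesture drawn at a different scale or offset yields nearly the same features as its template.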