An interactive participant in a live musical performance relies on several senses in order to perform, including hearing, sight, and touch. Our long-term goal is to have our adult-size humanoid robot Jaemi Hubo be an interactive participant in a live music ensemble. This work focuses on Jaemi Hubo's sense of sight as it pertains to a musical environment. Jaemi Hubo's musical awareness is increased through novel tempo and beat tracking techniques that use computer vision and digital signal processing methods in the absence of auditory cues. Real-time video frames of subjects moving to a regular beat are recorded using Jaemi Hubo's video capture device. For each successive video frame, an optic flow algorithm is applied to discern the direction and magnitude of motion. A Fast Fourier Transform is then applied to obtain the spectral content of the motion data. Next, a Gaussian weight centered on the average musical tempo is applied to the normalized spectrum. The location of the maximum of the weighted spectrum is taken as the tempo calculated from the video frames. A tempo-based dynamic threshold on the first derivative of the motion data is used to find the beat locations. Experiments using OpenCV and Matlab produced accurate tracking of the true tempo and beat timing in the captured video.
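The tempo-estimation steps described above (FFT of the motion data, Gaussian weighting of the normalized spectrum, peak picking) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses Python with NumPy rather than OpenCV and Matlab, it replaces the optic-flow output with a synthetic one-dimensional motion signal, and the function name `estimate_tempo` and the parameters `mean_bpm` and `sigma_bpm` (with their default values) are assumptions chosen for illustration.

```python
import numpy as np


def estimate_tempo(motion, fps, mean_bpm=120.0, sigma_bpm=30.0):
    """Estimate tempo in BPM from a 1-D motion-magnitude signal.

    mean_bpm and sigma_bpm are hypothetical parameters: the centre and
    width of the Gaussian weight; the abstract does not give their values.
    """
    motion = np.asarray(motion, dtype=float)
    motion = motion - motion.mean()          # remove the DC component

    # Magnitude spectrum of the motion data and its frequency axis in BPM.
    spectrum = np.abs(np.fft.rfft(motion))
    bpm = np.fft.rfftfreq(motion.size, d=1.0 / fps) * 60.0

    # Normalize the spectrum so that its largest value is 1.
    peak = spectrum.max()
    if peak > 0:
        spectrum = spectrum / peak

    # Gaussian weight centred on the assumed average musical tempo.
    weight = np.exp(-0.5 * ((bpm - mean_bpm) / sigma_bpm) ** 2)

    # The tempo estimate is the location of the weighted spectrum's maximum.
    return float(bpm[np.argmax(spectrum * weight)])


# Synthetic stand-in for optic-flow magnitude: a 100 BPM motion with a
# weaker component at twice that rate, sampled at 30 frames per second.
fps = 30.0
t = np.arange(360) / fps
motion = (np.sin(2 * np.pi * (100.0 / 60.0) * t)
          + 0.5 * np.sin(2 * np.pi * (200.0 / 60.0) * t))
tempo = estimate_tempo(motion, fps)
print(f"estimated tempo: {tempo:.1f} BPM")
```

With 360 frames at 30 frames per second, the frequency bins are spaced 5 BPM apart, so the estimate is quantized to that resolution. The Gaussian weight suppresses the double-tempo component, which is one plausible reason for weighting the spectrum before picking the peak.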