In recent years, video-based analysis of human motion has gained increasing interest, due in large part to the ongoing rapid development of computer and camera hardware, such as increased CPU power, fast and modular interfaces, and high-quality image digitisation. An equally important role is played by the development of powerful approaches for the analysis of visual data from video sources. In computer music, this development is reflected in a number of applications that analyse video and image data for gestural control of music and sound, such as Eyesweb, Jitter, openCV, and Gem. In this paper, an approach is presented for the control of music and sound parameters through hand gestures, which are recognised by a Time-Delay Neural Network (TDNN). The recognition networks were trained with appearance-based features extracted from the image sequences of a video camera. Cyclic hand gestures are proposed to enable fast and seamless recording and annotation of time series as multi-state patterns. For the supervised training of the TDNNs, feature maps based on the spatial Fourier transform of image sequences are proposed. The integration of the gesture recognition into an interactive music piece is described at the end of the paper.
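The overall pipeline described above can be sketched in a few lines: per-frame features are taken from the magnitudes of the lowest spatial Fourier coefficients, and short windows of consecutive feature vectors are stacked to form time-delay inputs. This is a minimal illustrative sketch, not the paper's implementation; the window length, the number of retained coefficients (`k`), and the function names are assumptions for the example.

```python
import numpy as np

def fourier_features(frames, k=4):
    """Compute a per-frame feature vector from the magnitudes of the
    lowest k x k spatial Fourier coefficients of each grayscale frame.
    (Illustrative choice; the paper's exact feature map may differ.)"""
    feats = []
    for frame in frames:
        spectrum = np.fft.fft2(frame)
        mag = np.abs(spectrum[:k, :k])  # low spatial frequencies only
        feats.append(mag.flatten())
    return np.stack(feats)  # shape: (num_frames, k*k)

def time_delay_windows(features, delay=3):
    """Stack `delay` consecutive feature vectors so that each network
    input spans a short temporal context, as in a TDNN input layer."""
    T, d = features.shape
    return np.stack([features[t:t + delay].flatten()
                     for t in range(T - delay + 1)])
```

For a sequence of ten 32x32 frames with `k=4` and `delay=3`, `fourier_features` yields a 10x16 feature matrix and `time_delay_windows` an 8x48 matrix of overlapping temporal windows, which could then be fed to a time-delay network for gesture classification.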