An acoustic detector for film slates is proposed to assist a human operator with the synchronization of audio and video in post-production. To be computationally efficient, the signal analysis is restricted to time-domain features. Although the features are statistically dependent, a separate classifier is trained for each of them. The statistical dependence is taken into account when the log-likelihood ratios provided by the individual classifiers are combined. The overall confidence in a classification is determined as a weighted sum of the individual log-likelihood ratios, where the weights depend on the correlation between the different features. Experimental results for real-world recordings from film sets show that the confidence measures allow fast identification of the film slates while minimizing the interference from false detections.
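The combination step can be illustrated with a minimal sketch. The abstract does not state how the weights are derived from the correlation, so the rule used here, w_i = 1 / sum_j |rho_ij|, is an assumption chosen only for illustration: it reduces to a plain sum of log-likelihood ratios when the features are uncorrelated and to their average when they are perfectly correlated. The function names `correlation_weights` and `combined_confidence` and the numerical values are likewise hypothetical.

```python
from typing import List, Sequence


def correlation_weights(corr: Sequence[Sequence[float]]) -> List[float]:
    """Return one weight per feature from a feature correlation matrix.

    ASSUMPTION (not from the source): w_i = 1 / sum_j |rho_ij|.
    The diagonal of a correlation matrix is 1, so each row sum is >= 1.
    """
    return [1.0 / sum(abs(rho) for rho in row) for row in corr]


def combined_confidence(llrs: Sequence[float],
                        corr: Sequence[Sequence[float]]) -> float:
    """Weighted sum of the per-feature log-likelihood ratios."""
    if len(llrs) != len(corr):
        raise ValueError("need one log-likelihood ratio per feature")
    weights = correlation_weights(corr)
    return sum(w * llr for w, llr in zip(weights, llrs))


# Illustrative values only: three per-feature log-likelihood ratios.
example_llrs = [2.0, 1.0, -0.5]

# Uncorrelated features: every weight is 1, so the ratios simply add up.
uncorrelated = [[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0]]

# Perfectly correlated features: every weight is 1/3, so the ratios are averaged.
fully_correlated = [[1.0, 1.0, 1.0],
                    [1.0, 1.0, 1.0],
                    [1.0, 1.0, 1.0]]

confidence_independent = combined_confidence(example_llrs, uncorrelated)
confidence_dependent = combined_confidence(example_llrs, fully_correlated)
```

Under this assumed rule, strongly correlated features contribute less to the overall confidence than they would in a plain sum, which keeps redundant evidence from being counted several times.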