A number of methods have recently been proposed to highlight salient regions in images and videos. Given the importance of attention in video quality evaluation, it would be useful to know how accurately these methods predict viewers' gaze locations in video. However, independent quantitative evaluations of saliency methods are lacking in the current literature. In this paper, we test nine different bottom-up saliency detection models on a set of standard video sequences. The eye-tracking data from 15 viewers, recorded for the first and second viewings of each sequence, are evaluated against the normalized saliency maps obtained for each frame of the sequence. An accuracy score is determined for each frame and averaged across all frames to provide a measure of performance. For each sequence, the scores of all methods are compared and analyzed statistically to determine whether there is a clear winner for that sequence. Further analysis and discussion of the performance of the various methods are provided in an attempt to discover which aspects of the saliency models lead to high gaze prediction accuracy.
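The per-frame scoring and averaging described above can be illustrated with a minimal sketch. The abstract does not name the accuracy score or the normalization, so this example assumes a normalized scanpath saliency (NSS) style score: each saliency map is z-scored and then sampled at the recorded gaze locations. The function names (`normalize_map`, `frame_score`, `sequence_score`) and the `(row, col)` gaze format are hypothetical and chosen only for illustration.

```python
import numpy as np


def normalize_map(saliency):
    """Z-score a saliency map (assumed normalization: zero mean, unit std)."""
    saliency = np.asarray(saliency, dtype=float)
    std = saliency.std()
    if std == 0:
        # A flat map carries no information; return all zeros.
        return np.zeros_like(saliency)
    return (saliency - saliency.mean()) / std


def frame_score(saliency, gaze_points):
    """Assumed NSS-style score: mean normalized saliency at gaze points.

    gaze_points is a non-empty list of (row, col) pixel coordinates,
    one per viewer (hypothetical format).
    """
    norm = normalize_map(saliency)
    rows, cols = zip(*gaze_points)
    return float(norm[list(rows), list(cols)].mean())


def sequence_score(saliency_maps, gaze_per_frame):
    """Average the per-frame scores over all frames that have gaze data."""
    scores = [
        frame_score(saliency, gaze)
        for saliency, gaze in zip(saliency_maps, gaze_per_frame)
        if gaze  # skip frames with no recorded gaze samples
    ]
    return float(np.mean(scores))


if __name__ == "__main__":
    # Toy 2x2 saliency map with a single salient pixel at (1, 1).
    toy_map = [[0.0, 0.0], [0.0, 1.0]]
    # Frame 1: gaze lands on the salient pixel; frame 2: gaze misses it.
    print(sequence_score([toy_map, toy_map], [[(1, 1)], [(0, 0)]]))
```

Under this assumed score, a positive value means gaze fell on regions with above-average predicted saliency, and a value near zero means the model did no better than chance. Comparing models per sequence would then amount to comparing their per-frame score distributions with a statistical test, which this sketch leaves out.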