This chapter describes computational visual attention models in the spatial domain that are based on the bottom-up mechanism. Although a large number of bottom-up computational models in the spatial domain have been proposed since 1998, this chapter discusses only a few typical ones: the baseline saliency (BS) model, models based on neural networks, and models based on statistical signal processing theory, namely information theory (the AIM model), decision theory (the DISC model), natural image statistics (the SUN model) and Bayesian theory (the surprise detection model).
Section 3.1 introduces the major parts of the BS system, while Section 3.2 addresses the issues related to visual attention for video. These two sections aim to give the reader the most important ideas for modelling bottom-up visual attention in the spatial domain. Section 3.3 presents more details and variations of the BS model, to give the reader more insight and choices within the topic. Section 3.4 introduces an alternative, graph-based approach to determining visual attention, and demonstrates and discusses how it differs from the BS model. Section 3.5 presents the AIM model, which is based on information maximization and extracts features of the input image with a bank of basis filters learned from natural images. Another model, referred to as DISC, which processes centre-surround inhibition on the basis of optimal decision theory, is introduced in Section 3.6. Section 3.7 then presents a paradigm shift in visual attention modelling (the SUN model): a methodology based on comprehensive statistics gathered from a large number of natural images, rather than from the current test image (as in the models of Sections 3.1 to 3.6). Finally, Section 3.8 presents a surprise detection model, based on Bayesian theory, for detecting salient locations.