Visual context plays an important role in humans' top-down control of gaze movement during target search. Exploring the mental development mechanism underlying incremental visual context encoding by population cells is an interesting issue. This paper presents a biologically inspired computational model that uses visual contextual cues for top-down eye-motion control when searching for targets in images. We propose a population-cell coding mechanism for encoding and decoding visual context, and implement the model as a neural network system. The system simulates a developmental learning mechanism by dynamically generating new coding neurons that incrementally encode visual context during training. The encoded context is decoded by population neurons in a top-down mode, which allows the model to direct gaze to the centers of the targets. The model was developed with the aims of low encoding quantity and high target-locating accuracy. Its performance is evaluated in a set of experiments that search for different facial objects in a set of human face images. Theoretical analysis and experimental results show that the proposed visual context encoding algorithm, which requires no weight updating, is fast, efficient and stable, and that population-cell coding generally performs better than single-cell coding and k-nearest-neighbor (k-NN)-based coding.
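As a rough illustration of the encode-then-decode idea summarized above, the following Python sketch grows a set of coding neurons incrementally and decodes a gaze shift from the whole population. It is not the paper's algorithm: the class name `PopulationContextCoder`, the parameters `sigma` and `tol`, the Gaussian response function, and the response-weighted averaging are all assumptions made for this example. A new neuron is added only when the current population mispredicts the target offset, and stored neurons are never modified, which mirrors the "no weight updating" property in spirit only.

```python
import numpy as np


class PopulationContextCoder:
    """Hypothetical sketch: incremental context encoding, population decoding."""

    def __init__(self, sigma=1.0, tol=0.5):
        self.sigma = sigma      # assumed width of each neuron's Gaussian response
        self.tol = tol          # assumed tolerated error in the predicted offset
        self.prototypes = []    # one stored context feature vector per neuron
        self.offsets = []       # one stored gaze offset (dx, dy) per neuron

    def _responses(self, feature):
        # Gaussian response of every coding neuron to the current context.
        protos = np.asarray(self.prototypes, dtype=float)
        d2 = np.sum((protos - feature) ** 2, axis=1)
        return np.exp(-d2 / (2.0 * self.sigma ** 2))

    def decode(self, feature):
        # Population decoding: response-weighted average of stored offsets.
        # Returns None when no neuron responds to this context.
        if not self.prototypes:
            return None
        feature = np.asarray(feature, dtype=float)
        r = self._responses(feature)
        total = r.sum()
        if total < 1e-12:
            return None
        offs = np.asarray(self.offsets, dtype=float)
        return (r[:, None] * offs).sum(axis=0) / total

    def train_step(self, feature, offset):
        # Add a new neuron only if the population mispredicts this sample.
        # Existing neurons are left untouched (no weight updating).
        feature = np.asarray(feature, dtype=float)
        offset = np.asarray(offset, dtype=float)
        pred = self.decode(feature)
        if pred is None or np.linalg.norm(pred - offset) > self.tol:
            self.prototypes.append(feature)
            self.offsets.append(offset)
            return True
        return False


# Toy usage with made-up 2-D context features and gaze offsets.
coder = PopulationContextCoder(sigma=1.0, tol=0.5)
coder.train_step([0.0, 0.0], [10.0, 0.0])   # first sample always adds a neuron
coder.train_step([0.1, 0.0], [10.0, 0.0])   # already predicted well, no new neuron
coder.train_step([5.0, 5.0], [-3.0, 4.0])   # new context, adds a second neuron
gaze_shift = coder.decode([5.0, 5.0])
```

In this sketch, single-cell coding would correspond to using only the most strongly responding neuron's offset, while the population variant blends the offsets of all responding neurons; which of the two is more accurate on real images is a question for the paper's experiments, not for this toy example.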