In this paper, an adaptive recognition model (ARM) is proposed for image annotation. The ARM consists of an adaptive classification network (CFN) and a nonlinear correlation network (CLN). The adaptive CFN annotates an image with keywords, and the CLN captures the correlations among keywords so that the annotation can be refined. An ARM annotates an image in two stages. In the first stage, features extracted from regions of the input image are fed to the CFN to produce classification labels. In the second stage, the CLN uses keyword correlations learned from the training images to refine the classification result. The ARM works in a forward-propagating manner, which makes image annotation highly efficient. Furthermore, the computational time of an ARM is insensitive to the number of regions in the input image and to the vocabulary size. The effect of keyword correlation in image annotation is comprehensively investigated on a real image dataset and a synthetic image dataset. The controllable synthetic dataset makes it possible to study the function of keyword correlation systematically and to analyze the performance of the ARM effectively. Experimental results demonstrate the efficiency and effectiveness of the ARM.
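
The two-stage, forward-only flow described above can be illustrated with a minimal sketch. It is not the ARM itself: the abstract does not specify the network architectures, so the single sigmoid layer standing in for the CFN, the max-pooling over regions, the single sigmoid layer standing in for the CLN, and every name and weight value below (`classify_regions`, `refine_with_correlation`, `annotate`, `PARAMS`, `VOCAB`) are assumptions chosen only to show how classification scores might be passed through a correlation stage.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def classify_regions(region_features, weights, biases):
    """Stage 1 stand-in (hypothetical CFN): score every keyword for each
    region, then keep the maximum score per keyword across regions."""
    per_region = [
        [sigmoid(z + b) for z, b in zip(matvec(weights, feats), biases)]
        for feats in region_features
    ]
    return [max(column) for column in zip(*per_region)]


def refine_with_correlation(scores, corr_weights, corr_biases):
    """Stage 2 stand-in (hypothetical CLN): one nonlinear forward pass that
    mixes keyword scores according to assumed correlation weights."""
    mixed = matvec(corr_weights, scores)
    return [sigmoid(z + b) for z, b in zip(mixed, corr_biases)]


def annotate(region_features, vocab, params, top_k=2):
    """Run both stages and return the top_k keywords by refined score."""
    scores = classify_regions(region_features, params["W_cfn"], params["b_cfn"])
    refined = refine_with_correlation(scores, params["W_cln"], params["b_cln"])
    ranked = sorted(zip(vocab, refined), key=lambda kv: kv[1], reverse=True)
    return [keyword for keyword, _ in ranked[:top_k]]


# Illustrative values only: three keywords, two 2-D region feature vectors.
VOCAB = ["sky", "sea", "car"]
PARAMS = {
    "W_cfn": [[2.0, 0.0], [0.0, 0.5], [0.0, 0.4]],
    "b_cfn": [0.0, 0.0, 0.0],
    # Assumed correlations: "sky" and "sea" support each other,
    # and both suppress "car".
    "W_cln": [[2.0, 0.5, -0.5], [1.0, 2.0, -0.5], [-1.0, -0.5, 2.0]],
    "b_cln": [0.0, 0.0, 0.0],
}
REGIONS = [[1.0, 0.0], [0.0, 1.0]]

if __name__ == "__main__":
    print(annotate(REGIONS, VOCAB, PARAMS))
```

In this toy setting the first stage scores "sea" and "car" almost equally, and the assumed correlation weights separate them in the second stage. Because the second stage in this sketch operates on a fixed-length score vector, its cost does not depend on how many regions the image has; whether the ARM achieves its reported insensitivity in the same way is not stated in the abstract.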