I. Introduction
In recent years, as digital imaging devices have become widespread in many fields, the demand for searching directly by image content has grown steadily. To address the computational complexity, memory load, and bandwidth limitations of visual search, MPEG drafted the Compact Descriptors for Visual Search (CDVS) standard [1], in which the compact image descriptor consists of a global descriptor (GD) and a local descriptor (LD) generated from selected SIFT points. CDVS has been shown to substantially improve descriptor compactness and to reduce computational complexity and memory demand while maintaining a high mean average precision (MAP). However, in a visual search system based on feature points, large scale changes, strong affine transformations, and quantization distortion introduced during descriptor extraction can cause some target objects to be missed in retrieval. Moreover, in complex application scenarios such as traffic vehicle retrieval, illumination may change greatly from day to night, and this large illumination difference leads to a considerable loss in retrieval performance. All of these issues hinder the use of image retrieval in practical applications.