Image Retrieval Using Convolutional Autoencoder, InfoGAN, and Vision Transformer Unsupervised Models


Image retrieval using ViT model.

Abstract:

Query by Image Content (QBIC), subsequently known as Content-Based Image Retrieval (CBIR), offers an advantageous solution in a variety of applications, including medical, meteorological, and search-by-image applications. Such CBIR systems primarily use similarity matching algorithms to compare image content and retrieve matching images from datasets. They essentially measure the spatial distance between the visual features extracted from a query image and those of similar images in the dataset. One of the most challenging query retrieval problems is Facial Sketched-Real Image Retrieval (FSRIR), which is based on content similarity matching. These facial retrieval systems are employed in a variety of contexts, including criminal justice. The difficulty of retrieving such images comes from the composition of the human face and its distinctive parts. In addition, the comparison between these types of images is made across two different domains. Moreover, to our knowledge, there are few large-scale facial datasets that can be used to assess the performance of retrieval systems. The success of the retrieval process is governed by the method used to estimate similarity and by how efficiently the compared images are represented. Representing visual features effectively might therefore resolve the main challenge posed by such systems. Hence, this paper makes several contributions that fill the research gap in content-based similarity matching and retrieval. The first contribution is extending the Chinese University Face Sketch (CUFS) dataset with augmented images, introducing to the community a novel dataset named Extended Sketched-Real Image Retrieval (ESRIR). The CUFS dataset has been extended from 100 images to include 53,000 facial sketches and 53,000 real facial images.
The paper's second contribution is presenting three new systems for sketched-real image retrieval based on convolutional autoencoder, InfoGAN, and Vision Transformer (ViT) unsupervised mode...
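
The abstract describes retrieval as measuring the distance between features extracted from a query image and features of the images in the dataset. The following is a minimal sketch of that idea only, not the paper's implementation. It assumes some encoder (for example an autoencoder, InfoGAN, or ViT) has already turned each image into a fixed-length feature vector. The choice of Euclidean distance, the function names `euclidean` and `retrieve`, and the toy feature values are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query_features, gallery, top_k=3):
    """Return the top_k (image_id, distance) pairs closest to the query.

    `gallery` maps a hypothetical image id to its feature vector.
    """
    scored = [
        (image_id, euclidean(query_features, feats))
        for image_id, feats in gallery.items()
    ]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:top_k]


# Toy 3-dimensional features, made up for illustration only.
gallery = {
    "photo_a": [0.9, 0.1, 0.0],
    "photo_b": [0.1, 0.8, 0.1],
    "photo_c": [0.85, 0.2, 0.05],
}
sketch_features = [0.88, 0.12, 0.02]  # features of a hypothetical query sketch

ranked = retrieve(sketch_features, gallery, top_k=2)
print([image_id for image_id, _ in ranked])
```

In a sketch-to-photo setting, the query vector would come from a sketch and the gallery vectors from real photographs, so the quality of the ranking depends on how well the encoder maps both domains into a shared feature space.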
Published in: IEEE Access (Volume: 11)
Page(s): 20445 - 20477
Date of Publication: 02 February 2023
Electronic ISSN: 2169-3536
