Automatic Image and Video Caption Generation With Deep Learning: A Concise Review and Algorithmic Overlap


Abstract:

Methodologies that utilize deep learning offer great potential for applications that attempt to automatically generate captions or descriptions for images and video frames. Image and video captioning are considered intellectually challenging problems in imaging science. Application domains include automatic caption (or description) generation for images and videos for people with various degrees of visual impairment; automatic creation of metadata for images and videos (indexing) for use by search engines; general-purpose robot vision systems; and many others. Each of these application domains can positively and significantly impact many other task-specific applications. This article is not meant to be a comprehensive review of image captioning; rather, it is a concise review of both image captioning and video captioning methodologies based on deep learning. This study treats both image and video captioning by emphasizing the algorithmic overlap between the two.
Published in: IEEE Access (Volume: 8)
Page(s): 218386 - 218400
Date of Publication: 04 December 2020
Electronic ISSN: 2169-3536
