Fusing Depths: Investigating the Synergy of Convolutional Neural Networks and Long Short-Term Memory Networks for Enhanced Image Caption Generation | IEEE Conference Publication | IEEE Xplore