Caption Generation From Road Images for Traffic Scene Modeling


Abstract:

In this traffic-scene-modeling study, we propose an image-captioning network that incorporates element attention into an encoder-decoder mechanism to generate more reasonable scene captions. A visual-relationship-detecting network is also developed to detect the relative positions of object pairs. First, the traffic scene elements are detected and segmented according to their clustered locations. Then, the image-captioning network is applied to generate a description of each traffic scene element, and the visual-relationship-detecting network is used to detect the positional relations of all object pairs in each subregion. Finally, the static and dynamic traffic elements are selected and organized to construct a 3D model according to the captions and the positional relations. The reconstructed 3D traffic scenes can be used for offline testing of unmanned vehicles. Evaluations and comparisons on the TSD-max, KITTI, and Microsoft COCO datasets demonstrate the effectiveness of the proposed framework.
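To make the "position relations of all object pairs" step concrete, the following is a minimal sketch only. It is not the paper's visual-relationship-detecting network, which is a learned model; it substitutes a simple geometric rule on bounding-box centers. The `Box` type, the relation vocabulary, and the function names are all assumptions introduced for illustration.

```python
from itertools import combinations
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in image pixels (hypothetical structure)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center(self):
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def relative_position(a: Box, b: Box) -> str:
    """Coarse position of `a` relative to `b`, judged from box centers only."""
    (ax, ay), (bx, by) = a.center, b.center
    dx, dy = ax - bx, ay - by
    # Pick the dominant axis; ties go to the horizontal relation.
    if abs(dx) >= abs(dy):
        return "left of" if dx < 0 else "right of"
    # Image coordinates: y grows downward, so a smaller y means "above".
    return "above" if dy < 0 else "below"


def pairwise_relations(boxes):
    """Enumerate every unordered object pair in a subregion exactly once."""
    return [
        (a.label, relative_position(a, b), b.label)
        for a, b in combinations(boxes, 2)
    ]
```

Given the detected boxes of one subregion, `pairwise_relations` returns (subject, relation, object) triples, which is roughly the kind of input a downstream 3D scene builder would need alongside the captions.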
Published in: IEEE Transactions on Intelligent Transportation Systems (Volume: 23, Issue: 7, July 2022)
Page(s): 7805 - 7816
Date of Publication: 23 April 2021
