Dense Video Captioning using BiLSTM Encoder

Abstract:

Video captioning is a widely researched topic that integrates visual information and natural language, but captioning long untrimmed videos remains challenging because such videos contain multiple events and the model has to describe each one. To address this issue, this paper discusses work on dense video captioning, an emerging research subject that entails identifying temporal events in a video and generating a caption for each event. The proposed architecture comprises an event proposal module, an EfficientNet-B7 network for feature extraction from sampled frames, a BiLSTM encoder, and an LSTM decoder for captioning. The BiLSTM encoder uses both past and future context from the video when generating captions. The model is trained and tested on the MSVD dataset, which has around 2000 videos and their corresponding captions. The proposed framework shows improved video captioning accuracy, with a BLEU score of 0.78 and a METEOR score of 0.34.
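
The claim that a BiLSTM encoder sees both past and future context can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the `LSTMCell` class, the `bilstm_encode` helper, the random weights, and the tiny dimensions are all assumptions made for illustration, and the per-frame vectors stand in for EfficientNet-B7 features. One LSTM reads the frames in order, a second reads them in reverse, and the two hidden states are concatenated at each time step.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """A single LSTM cell with random weights (hypothetical, illustrative only)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight row per gate unit; gate order: input, forget, candidate, output.
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(n)]
                  for _ in range(4 * hidden_size)]
        self.b = [0.0] * (4 * hidden_size)

    def step(self, x, h, c):
        z = x + h  # concatenate the frame feature with the previous hidden state
        pre = [sum(wi * zi for wi, zi in zip(row, z)) + bias
               for row, bias in zip(self.w, self.b)]
        hs = self.hidden_size
        i = [sigmoid(v) for v in pre[0:hs]]
        f = [sigmoid(v) for v in pre[hs:2 * hs]]
        g = [math.tanh(v) for v in pre[2 * hs:3 * hs]]
        o = [sigmoid(v) for v in pre[3 * hs:4 * hs]]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def run_lstm(cell, frames):
    """Run the cell over the sequence and return the hidden state at every step."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    outputs = []
    for x in frames:
        h, c = cell.step(x, h, c)
        outputs.append(h)
    return outputs


def bilstm_encode(frames, fwd_cell, bwd_cell):
    """Concatenate forward (past) and backward (future) states per time step."""
    forward = run_lstm(fwd_cell, frames)
    # Process the reversed sequence, then flip back so indices line up with frames.
    backward = run_lstm(bwd_cell, frames[::-1])[::-1]
    return [fw + bw for fw, bw in zip(forward, backward)]


# Toy usage: 5 sampled frames with 8-dimensional stand-in features.
rng = random.Random(0)
feature_dim, hidden = 8, 4
fwd = LSTMCell(feature_dim, hidden, rng)
bwd = LSTMCell(feature_dim, hidden, rng)
frames = [[rng.uniform(-1, 1) for _ in range(feature_dim)] for _ in range(5)]
encoded = bilstm_encode(frames, fwd, bwd)

# Changing only the last frame alters the backward half of the first step's encoding.
altered = frames[:-1] + [[1.0] * feature_dim]
encoded_altered = bilstm_encode(altered, fwd, bwd)
```

In this sketch the first `hidden` values of each encoded step summarize the frames seen so far, and the remaining values summarize the frames still to come, which is why editing the final frame changes the encoding of the first one. In a captioning model of the kind the abstract describes, an LSTM decoder would consume these encodings to generate words; that part is omitted here.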

Published in: 2022 3rd International Conference for Emerging Technology (INCET)

Date of Conference: 27-29 May 2022

Date Added to IEEE Xplore: 15 July 2022

DOI: 10.1109/INCET54531.2022.9824569

Conference Location: Belgaum, India
