Additive Attention for Medical Visual Question Answering


Abstract:

Medical visual question answering is a prominent research area that presents significant challenges within the broader domain of visual question answering. Traditional medical visual question answering methods first employ a convolutional neural network (CNN) to extract image information, then use bilinear attention mechanisms to fuse the textual question features with the visual features of the image. However, extracting visual features with a CNN alone often overlooks the global context within the image, which is crucial for answering questions accurately. This paper therefore introduces an Additive Attention Network (AANet) to capture more comprehensive image features. Specifically, a CNN is employed to obtain local visual features of images, while an additive attention mechanism acquires global contextual features. The two components complement each other, enhancing the representation of visual features and strengthening the model’s global contextual awareness. The proposed method demonstrated superior performance on the VQA-RAD dataset, achieving an overall accuracy of 72.5% and an accuracy of 81.9% on closed questions.
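
The abstract does not give the AANet formulation, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a common tanh-based additive attention score over a set of CNN region features, and assumes that the resulting global context vector is added back to every local feature. The names `additive_attention_pool` and `fuse_local_and_global`, the parameters `W` and `v`, and all dimensions are hypothetical, and random numbers stand in for real CNN outputs and learned weights.

```python
import math
import random


def additive_attention_pool(features, W, v):
    """Pool N region vectors into one global context vector.

    features: list of N vectors of length d (stand-in for a flattened CNN feature map).
    W: k x d projection matrix, v: length-k scoring vector (hypothetical learned parameters).
    Returns (weights, context): softmax attention weights and the weighted sum of regions.
    """
    scores = []
    for x in features:
        # Additive scoring: project the region, apply tanh, then score with v.
        hidden = [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Numerically stable softmax over the N regions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Global context: attention-weighted sum of the region features.
    d = len(features[0])
    context = [sum(a * x[j] for a, x in zip(weights, features)) for j in range(d)]
    return weights, context


def fuse_local_and_global(features, context):
    """Assumed fusion: add the global context vector to each local feature."""
    return [[xj + cj for xj, cj in zip(x, context)] for x in features]


# Toy data: 6 regions with 4 channels each, attention hidden size 3.
rng = random.Random(0)
N, d, k = 6, 4, 3
features = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(N)]
W = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(k)]
v = [rng.uniform(-1, 1) for _ in range(k)]

weights, context = additive_attention_pool(features, W, v)
fused = fuse_local_and_global(features, context)
```

In this sketch every fused region vector carries both its own local response and a summary of the whole image. The fusion with the question, which the abstract attributes to bilinear attention, is left out.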
Date of Conference: 17-19 November 2023
Date Added to IEEE Xplore: 13 March 2024
Conference Location: Qiangdao, China
