Automated Image Annotation with Voice Synthesis Using Machine Learning | IEEE Conference Publication | IEEE Xplore

Abstract:

The goal of our project is to develop an automated image annotator with voice synthesis, built with a CNN (Convolutional Neural Network) and an LSTM (Long Short-Term Memory) architecture, that generates both textual and audio captions. The framework involves several steps: image embedding, image feature discovery and retrieval using VGG16, data preprocessing, tokenization, and LSTM-based caption generation, which together aim to produce accurate and rich captions. The VGG16 architecture allows efficient extraction of image features, which supports the creation of descriptive, meaningful captions. The project's test results show that the proposed system produces multiple accurate captions, and the corresponding audio, for the images. The system has potential across various applications, such as helping people with visual impairments understand visual details or enhancing multimedia content by providing more descriptive captions.
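The abstract names data preprocessing and tokenization as steps ahead of LSTM-based caption generation but does not describe them. The sketch below shows one common way such a step is set up in CNN + LSTM captioning pipelines, and it is not necessarily the authors' method. The boundary tokens `startseq` and `endseq`, the helper names, the left-padding scheme, and the two sample captions are all assumptions made for illustration. The VGG16 feature extractor and the LSTM itself are left out so that the example runs with the standard library alone.

```python
import string

# Hypothetical boundary tokens; the paper does not say which markers it uses.
START, END = "startseq", "endseq"


def clean_caption(text):
    """Lowercase, strip punctuation, keep alphabetic words, add boundary tokens."""
    table = str.maketrans("", "", string.punctuation)
    words = [w for w in text.lower().translate(table).split() if w.isalpha()]
    return [START] + words + [END]


def build_vocab(captions):
    """Map each word to a positive integer id; 0 is reserved for padding."""
    words = sorted({w for cap in captions for w in cap})
    return {w: i + 1 for i, w in enumerate(words)}


def make_training_pairs(caption, vocab, max_len):
    """Turn one caption into (left-padded prefix, next-word id) pairs.

    In a full pipeline each prefix would be paired with the image's
    feature vector and fed to the LSTM, which learns to predict the next word.
    """
    ids = [vocab[w] for w in caption]
    pairs = []
    for i in range(1, len(ids)):
        prefix = ids[:i][-max_len:]
        padded = [0] * (max_len - len(prefix)) + prefix
        pairs.append((padded, ids[i]))
    return pairs


# Illustrative captions only; these are not taken from the paper's dataset.
raw = ["A dog runs on the grass.", "Two children play football!"]
captions = [clean_caption(c) for c in raw]
vocab = build_vocab(captions)
max_len = max(len(c) for c in captions)
pairs = make_training_pairs(captions[0], vocab, max_len)
```

Each caption of n tokens yields n - 1 training pairs, so the model sees every prefix of the caption together with the word that should follow it. At inference time the same encoding would be applied to the partial caption generated so far, starting from the start token alone.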
Date of Conference: 12-14 July 2024
Date Added to IEEE Xplore: 04 October 2024
ISBN Information:
Conference Location: RAIPUR, India
