
Synthesizing Realistic Images from Textual Descriptions: A Transformer-Based GAN Approach



Abstract:

The ability to automatically generate realistic images from textual input is a challenging and important goal in artificial intelligence. In this research, a novel approach is presented that combines RoBERTa, a transformer-based language model, with Generative Adversarial Networks (GANs) to synthesize high-quality images from textual descriptions. The proposed architecture uses the RoBERTa model to encode the text input and a GAN that conditions each pixel on this encoding to produce an image that closely represents the input. The quality of the synthesized images is measured using the Fréchet Inception Distance (FID) and Inception Score (IS) metrics. Three variants of the architecture are proposed, and the experiments demonstrate that this approach can produce realistic images. The results indicate that transformer-based language models can be used effectively with GANs for image synthesis, paving the way for further research in this area.
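The abstract does not give implementation details of how the RoBERTa encoding conditions the generator. As a rough illustration only, the following sketch shows the general text-conditioned GAN pattern: a pooled sentence embedding is projected into a conditioning vector, concatenated with a noise vector, and mapped to pixels. All layer sizes and weights here are hypothetical stand-ins (including the 768-dimensional embedding, which matches RoBERTa-base pooled outputs), not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a 768-d sentence embedding (the size of a
# RoBERTa-base pooled output) is projected to a conditioning vector,
# concatenated with noise, and mapped to a small RGB image.
TEXT_DIM, NOISE_DIM, COND_DIM, IMG = 768, 100, 128, 16

W_proj = rng.normal(0, 0.02, (TEXT_DIM, COND_DIM))
W_gen = rng.normal(0, 0.02, (COND_DIM + NOISE_DIM, IMG * IMG * 3))

def generate(text_embedding, noise):
    # Project the sentence embedding into the conditioning space.
    cond = np.tanh(text_embedding @ W_proj)
    # Concatenate conditioning with noise, as in a conditional GAN.
    z = np.concatenate([cond, noise])
    # A single linear "generator" layer, squashed to [0, 1] pixel values;
    # a real generator would be a deep deconvolutional network.
    pixels = 1.0 / (1.0 + np.exp(-(z @ W_gen)))
    return pixels.reshape(IMG, IMG, 3)

text_emb = rng.normal(size=TEXT_DIM)  # stand-in for a RoBERTa pooled output
img = generate(text_emb, rng.normal(size=NOISE_DIM))
print(img.shape)  # (16, 16, 3)
```

In a full system, the generator would be trained adversarially against a discriminator that also sees the text embedding, so that it penalizes both unrealistic images and image-text mismatches.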
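Of the two evaluation metrics mentioned, FID has a closed form: it is the Fréchet distance between two Gaussians fitted to Inception-network activations of real and generated images, FID = ||μ₁ − μ₂||² + Tr(C₁ + C₂ − 2(C₁C₂)^½). A minimal numpy sketch of that formula (taking the activation statistics as given, and using the symmetric form of the matrix square root for numerical stability):

```python
import numpy as np

def matrix_sqrt_psd(a):
    # Square root of a symmetric positive semi-definite matrix
    # via eigendecomposition.
    vals, vecs = np.linalg.eigh(a)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T

def fid(mu1, cov1, mu2, cov2):
    """Frechet Inception Distance between two Gaussians fitted to
    Inception activations of real and generated image sets."""
    diff = mu1 - mu2
    c2_half = matrix_sqrt_psd(cov2)
    # Tr((cov1 @ cov2)^(1/2)) computed via the equivalent symmetric form
    # (c2^(1/2) cov1 c2^(1/2))^(1/2), which stays PSD.
    covmean = matrix_sqrt_psd(c2_half @ cov1 @ c2_half)
    return diff @ diff + np.trace(cov1 + cov2 - 2.0 * covmean)

# Toy example: two 2-D Gaussians with equal covariance but shifted means,
# so FID reduces to the squared mean distance.
mu1, mu2, cov = np.zeros(2), np.array([3.0, 4.0]), np.eye(2)
print(round(fid(mu1, cov, mu2, cov), 4))  # 25.0
```

Lower FID indicates that the generated-image statistics are closer to those of real images; in practice the means and covariances are estimated from Inception-v3 pool features over thousands of samples.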
Date of Conference: 16-17 June 2023
Date Added to IEEE Xplore: 21 August 2023
Conference Location: Gazipur, Bangladesh

