Abstract:
Text-based image manipulation is a popular research topic with many applications. However, it is challenging because no ground-truth edited dataset exists and textual descriptions are abstract and ambiguous. To alleviate these difficulties, we propose a manipulation framework consisting of proposal attentional GANs, a language-related semantic mask, and a language-guided ranker. Specifically, we construct an editing proposal generator that produces suitable edited proposals with and without semantic conditions; its sub-generators can be reorganized to output proposals covering as many aspects as possible. To distinguish text-relevant from text-irrelevant regions, we introduce a language-related semantic mask based on the source image and the target caption. We then use a language-guided ranker to retrieve the best edited result from the proposals, using multi-modal similarity and the language-related semantic mask. Extensive experiments on widely used datasets demonstrate that our model can manipulate images interactively and effectively improves editing quality.
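The ranking step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the scoring formula, so the score used here (cosine similarity between a proposal embedding and the text embedding, minus a penalty for changes outside the masked region) is an assumption, as are the function names `cosine`, `background_change`, and `rank_proposals`, the weight `lam`, and the flat-list representation of images and embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def background_change(source, edited, mask):
    """Mean absolute pixel change where mask == 0 (assumed text-irrelevant region)."""
    diffs = [abs(s - e) for s, e, m in zip(source, edited, mask) if m == 0]
    return sum(diffs) / len(diffs) if diffs else 0.0


def rank_proposals(source, proposals, proposal_embs, text_emb, mask, lam=1.0):
    """Return (index of best proposal, list of scores).

    Hypothetical score: text-image similarity minus lam times the change
    made outside the language-related mask. The real ranker may differ.
    """
    scores = [
        cosine(emb, text_emb) - lam * background_change(source, img, mask)
        for img, emb in zip(proposals, proposal_embs)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```

Under this assumed score, a proposal that matches the caption but also alters regions the mask marks as text-irrelevant ranks below one that matches slightly less well while leaving those regions untouched.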
Published in: IEEE Transactions on Multimedia ( Volume: 25)
- IEEE Keywords
- Index Terms
- Use Of Imaging
- Source Images
- Textual Descriptions
- Proposal Generation
- Natural Language
- Visual Features
- Attention Mechanism
- Autoencoder
- RGB Images
- Attention Module
- Word Embedding
- Matching Score
- Trivial Task
- Detailed Visualization
- Image Editing
- Input Elements
- Correct Features
- Focus Of Future Work
- Editing Levels
- Target Text
- Sentence Embedding
- Scene Graph
- StyleGAN