
Sketch-Guided Text-to-Image Generation with Spatial Control



Abstract:

Recent text-to-image generation models can produce high-quality images from textual prompts. However, it is difficult for these models to correctly interpret instructions specifying complex images with multiple objects using text alone. To address this issue, we propose sketch-guided spatial control for text-to-image diffusion models. In the feature extraction stage of the proposed framework, sketch inputs are segmented into individual objects using an image segmentation approach. The resulting bounding boxes and labels are fed as spatial-guidance inputs into the attention layers of the diffusion model. In the image generation stage, the proposed model uses a pretrained text-to-image diffusion model as the image generator. We assess the proposed method through both quantitative and qualitative evaluations, demonstrating its versatility in spatial control based on user sketches.
Date of Conference: 12-14 January 2024
Date Added to IEEE Xplore: 21 May 2024
Conference Location: Kyoto, Japan
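
The spatial-guidance idea in the abstract can be sketched in code. The following is a hypothetical illustration, not the paper's implementation (the abstract does not give the exact formulation): each object's bounding box from the segmented sketch is rasterized into a binary mask over the latent grid, and that mask is used as an additive bias on a text token's cross-attention logits so attention concentrates inside the object's region. All function names, the latent grid size, and the bias strength are assumptions for illustration.

```python
# Hypothetical sketch of bounding-box-guided cross-attention biasing.
# Assumed, not from the paper: latent_size, the additive-bias scheme,
# and the strength parameter.
import numpy as np

def boxes_to_attention_masks(boxes, labels, latent_size=8):
    """boxes: list of (x0, y0, x1, y1) in normalized [0, 1] coordinates;
    labels: object names from the segmentation stage.
    Returns {label: (latent_size, latent_size) binary mask}."""
    masks = {}
    for (x0, y0, x1, y1), label in zip(boxes, labels):
        m = np.zeros((latent_size, latent_size), dtype=np.float32)
        r0, r1 = int(y0 * latent_size), int(np.ceil(y1 * latent_size))
        c0, c1 = int(x0 * latent_size), int(np.ceil(x1 * latent_size))
        m[r0:r1, c0:c1] = 1.0  # 1 inside the object's box, 0 elsewhere
        masks[label] = m
    return masks

def biased_attention(attn_logits, mask, strength=5.0):
    """Add a positive bias inside the object's box before softmax,
    pushing this token's attention toward its spatial region."""
    biased = attn_logits + strength * mask.reshape(-1)
    e = np.exp(biased - biased.max())  # numerically stable softmax
    return e / e.sum()

# Example: one "cat" box covering the left half of an 8x8 latent grid.
masks = boxes_to_attention_masks([(0.0, 0.0, 0.5, 1.0)], ["cat"])
logits = np.zeros(64)  # uniform attention logits over the 64 latents
attn = biased_attention(logits, masks["cat"])
```

After biasing, the "cat" token's attention mass lies mostly in the left half of the grid, which is the spatial-control effect the abstract describes at the attention-layer level.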

