Waving Goodbye to Low-Res: A Diffusion-Wavelet Approach for Image Super-Resolution | IEEE Conference Publication | IEEE Xplore

Abstract:

Image Super-Resolution (SR) remains challenging, particularly in achieving high-quality details without extensive computational cost. Existing methods often struggle to balance the trade-off between image quality, especially in high-frequency details, and computational efficiency. In this paper, we present a novel Diffusion-Wavelet (DiWa) approach for bridging this gap. It leverages the strengths of diffusion models and the discrete wavelet transformation. By enabling the diffusion model to operate in the frequency domain, our models effectively hallucinate high-frequency information for SR images on the wavelet spectrum, resulting in high-quality and detailed reconstructions in image space. Quantitatively, our method outperforms other state-of-the-art diffusion-based SR methods, namely SR3 and SRDiff, regarding PSNR, SSIM, and LPIPS on both face (8x scaling) and general (4x scaling) SR benchmarks. Meanwhile, using the frequency domain allows us to use fewer parameters than the compared models: 92M parameters instead of 550M compared to SR3 and 9.3M instead of 12M compared to SRDiff. Additionally, DiWa outperforms other state-of-the-art generative methods on general SR datasets while saving inference time (ca. 250 %).
Date of Conference: 30 June 2024 - 05 July 2024
Date Added to IEEE Xplore: 09 September 2024
Conference Location: Yokohama, Japan

I. Introduction

The incorporation of Diffusion Models (DMs) began a new era of image Super-Resolution (SR) innovation. While regression-based methods like standard CNNs may work at low magnification ratios, they often fail to produce the high-frequency details needed for high magnification ratios. Generative models and, more recently, DMs have proven to be effective tools for tackling this issue [1]–[3]. Moreover, DMs produce reconstructions that humans subjectively perceive as higher in quality than those of regression-based methods [4]. Nevertheless, closing the gap between quantitative image quality and human preferences requires finer high-frequency detail prediction to enhance the overall realism [5]. Another pressing demand is accessibility, since DMs are computationally intensive [6]. For example, creating 50,000 small images (32×32) using a DM can take ca. 20 hours due to the iterative process, whereas a GAN can do this in under a minute on an Nvidia 2080 Ti GPU.
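
To make the wavelet-domain idea from the abstract concrete, the following is a minimal sketch of a single-level 2D discrete wavelet transform and its inverse. It is not the paper's implementation: the choice of the Haar wavelet, the function names haar_dwt2 and haar_idwt2, and the sub-band labels are assumptions made for illustration only. It shows that an image splits into one low-frequency band and three high-frequency detail bands, each at half the spatial resolution, and that the original image is recovered exactly from them.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar DWT (hypothetical helper).

    Expects a 2D float array with even height and width. Returns four
    sub-bands, each half the height and half the width of the input.
    Sub-band naming conventions vary between libraries.
    """
    a = x[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a + b - c - d) / 2.0  # detail: difference between rows
    hl = (a - b + c - d) / 2.0  # detail: difference between columns
    hh = (a - b - c + d) / 2.0  # detail: diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution array."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32))  # stand-in for a grayscale image
    bands = haar_dwt2(image)
    print([band.shape for band in bands])
    print(np.allclose(haar_idwt2(*bands), image))
```

Under these assumptions, a model that works on the four sub-bands processes inputs with a quarter of the spatial area per channel, and the three detail bands isolate exactly the high-frequency content that the introduction identifies as hard to predict.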
