Abstract:
The increasing demand for image generation on mobile devices [1] highlights the need for high-performing image-generative models, including the diffusion model (DM) [2], [3]. A conventional DM requires numerous UNet-based denoising timesteps (~50), leading to high computation and external memory access (EMA) costs. Recently, the Few-Step Diffusion Model (FSDM) [4] was introduced, as shown in Fig. 23.3.1, which reduces the denoising timesteps to 1–4 through knowledge distillation while maintaining high image quality, cutting computation and EMA by 22.0× and 42.3×, respectively. However, prior diffusion-model architectures, which accelerated the many denoising steps of a DM [5], [6] by exploiting inter-timestep redundancy in the UNet, fail to speed up the few denoising steps of an FSDM, where such redundancy is absent. Moreover, a multi-modal DM adds computational cost in the encoder, and an FSDM shifts the computational bottleneck from the UNet to the encoder and decoder. Additionally, an FSDM is more sensitive to quantization, since fewer denoising steps demand higher precision. To tackle these challenges, we exploit mixed-precision and group quantization [7] as a unified optimization scheme applicable to the encoder, UNet, and decoder of an FSDM, even without inter-timestep redundancy.
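To illustrate the group-quantization idea referenced above [7], the following is a minimal sketch (not the paper's hardware implementation): symmetric integer quantization where each group of weights shares its own scale, so a local outlier inflates the quantization step only within its group rather than across the whole tensor. The function names, group size, and bit width here are illustrative assumptions.

```python
# Illustrative sketch of group-wise symmetric quantization; the names,
# group size, and bit width are assumptions, not the paper's design.
import numpy as np

def group_quantize(w: np.ndarray, group_size: int = 64, n_bits: int = 8):
    """Quantize a 1-D weight vector in groups, each with its own scale."""
    qmax = 2 ** (n_bits - 1) - 1               # e.g. 127 for INT8
    w = w.reshape(-1, group_size)              # assumes len(w) % group_size == 0
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax   # per-group scale
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def group_dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Recover an approximation of the original weights."""
    return (q.astype(np.float32) * scale).reshape(-1)

# Per-group scales track local weight magnitude, so the rounding error in
# each group is bounded by half of that group's own (small) step size.
w = np.random.randn(256).astype(np.float32)
q, s = group_quantize(w)
w_hat = group_dequantize(q, s)
max_err = np.abs(w - w_hat).max()
```

With one per-tensor scale, a single large weight would set the step size for all 256 values; here it only affects its own group of 64, which is the precision benefit the abstract leans on for quantization-sensitive FSDMs.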
Date of Conference: 16-20 February 2025
Date Added to IEEE Xplore: 06 March 2025