MO1.R2.4

Frequency Domain Diffusion Model with Scale-Dependent Noise Schedule

Amir Ziashahabi, Baturalp Buyukates, University of Southern California, United States; Artan Sheshmani, Massachusetts Institute of Technology, United States; Yi-Zhuang You, University of California San Diego, United States; Salman Avestimehr, University of Southern California, United States

Session:
Topics in Machine Learning 1

Track:
8: Machine Learning

Location:
Ypsilon I-II-III

Presentation Time:
Mon, 8 Jul, 11:05 - 11:25

Session Chair:
Deniz Gündüz, Imperial College

Abstract
Diffusion models have played a crucial role in recent advances in generative image modeling. These models are characterized by a forward process that incrementally corrupts images. The modeling objective is to learn a reverse process that reconstructs the original image from degraded inputs, so that the trained model can then generate natural images from pure noise. In this work, we introduce a novel diffusion process that operates in the frequency domain. The frequency domain representation of an image is typically sparse, with energy concentrated predominantly in the low-frequency components. This inherent sparsity helps separate signal from noise during the reverse process. We use this property to introduce a scale-dependent noise schedule that offers precise control over the different image scales. Working in the frequency domain also allows us to modify the training protocol, yielding significant computational savings: a speedup of 2.7-8.5$\times$ over image domain models, which operate with fixed noise schedules, without a significant drop in generated image quality.
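The idea of a forward process whose noise level depends on spatial frequency can be illustrated with a short sketch. The code below is not the authors' method: the abstract does not specify the transform, the schedule, or the training protocol, so the use of a 2-D FFT, the function names, the `sharpness` parameter, and the particular schedule formula are all assumptions chosen for illustration only. It shows (a) how to measure the share of image energy in low frequencies and (b) one hypothetical schedule in which high-frequency coefficients are corrupted faster than low-frequency ones.

```python
import numpy as np


def radial_frequency(h, w):
    """Radial frequency of each 2-D FFT coefficient, scaled to [0, 1]."""
    fy = np.fft.fftfreq(h)[:, None]  # cycles per pixel along rows
    fx = np.fft.fftfreq(w)[None, :]  # cycles per pixel along columns
    r = np.sqrt(fx ** 2 + fy ** 2)
    return r / r.max()               # 0 at DC, 1 at the highest frequency


def low_frequency_energy_fraction(image, cutoff=0.25):
    """Fraction of spectral energy at normalized radius <= cutoff."""
    power = np.abs(np.fft.fft2(image, norm="ortho")) ** 2
    mask = radial_frequency(*image.shape) <= cutoff
    return power[mask].sum() / power.sum()


def scale_dependent_alpha(t, r, sharpness=4.0):
    """Hypothetical signal-retention factor for time t in [0, 1].

    Assumed form, not taken from the paper: the exponent grows with
    frequency r, so fine scales lose signal earlier than coarse scales.
    """
    return (1.0 - t) ** (1.0 + sharpness * r)


def forward_noise(image, t, rng, sharpness=4.0):
    """Corrupt a 2-D image with a per-frequency variance-preserving mix."""
    h, w = image.shape
    coeffs = np.fft.fft2(image, norm="ortho")
    alpha = scale_dependent_alpha(t, radial_frequency(h, w), sharpness)
    # White Gaussian noise, moved to the frequency domain.
    noise = np.fft.fft2(rng.standard_normal((h, w)), norm="ortho")
    noisy = np.sqrt(alpha) * coeffs + np.sqrt(1.0 - alpha) * noise
    # alpha is symmetric in frequency, so the result is real up to rounding.
    return np.fft.ifft2(noisy, norm="ortho").real
```

With this assumed schedule, `t = 0` returns the input unchanged and `t = 1` returns pure noise; at intermediate `t`, the coarse structure of the image survives longer than its fine detail. A real implementation would need to settle the transform, the schedule, and the network parameterization, none of which are given in the abstract.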