Diffusion models usually denoise Gaussian noise. Can they be applied to denoise other types of noise, like Perlin, simplex, value, and cell noise? This assumes that the reverse denoising process is still applied.
1 Answer
It seems that several papers have explored diffusion models with non-Gaussian noise:
Nachmani, Eliya, Robin San Roman, and Lior Wolf. "Non Gaussian Denoising Diffusion Models." arXiv, June 14, 2021. https://doi.org/10.48550/arXiv.2106.07582.
Yoon, Eunbi, Keehun Park, Sungwoong Kim, and Sungbin Lim. "Score-Based Generative Models with Lévy Processes," 2023. https://openreview.net/forum?id=0Wp3VHX0Gm.
Zhou, Mingyuan, Tianqi Chen, Zhendong Wang, and Huangjie Zheng. "Beta Diffusion." arXiv, September 14, 2023. https://doi.org/10.48550/arXiv.2309.07867.
Li, Henry. "Non-Normal Diffusion Models," 2023. https://openreview.net/forum?id=6N8L9D5M9L#all.
However, the lack of volume in this area suggests that using non-Gaussian noise remains more of a curiosity than a serious direction of study, probably due to a lack of compelling motivation.
Recall that the ultimate purpose of a generative model is to generate samples from a complex distribution, say $q_{\text{data}}(\mathbf{x})$. Theories and implementations that accomplish this task efficiently and effectively are more likely to gain attention and widespread use.
Now, to investigate possible reasons why non-Gaussian noise is not actively pursued, let's examine how diffusion models (DMs) approach the goal of learning to sample from a complex distribution. At an abstract level, the design of a DM involves the following steps:
Choose an easy-to-sample source distribution $p_0(\mathbf{x}_0)$.
(Note: This follows the convention from flow-matching literature. In typical DM literature, the source distribution is usually chosen as the terminal condition, with the data distribution as the initial condition.)
Connect $p_0(\mathbf{x}_0)$ to the target distribution $p_1(\mathbf{x}_1) = q_{\text{data}}(\mathbf{x}_1)$ via a probability path $(p_t(\mathbf{x}_t))_{t\in[0,1]}$. While there is considerable freedom in choosing this path, certain choices prove more tractable and easier to implement than others.
Determine a suitable joint distribution $p(\mathbf{x}_{t\in[0,1]})$. Equivalently, realize a suitable random process $t \mapsto \mathbf{x}_t$ with its marginal at time $t$ equal to $p_t(\mathbf{x}_t)$ for each $t \in [0, 1]$.
Example: Diffusion Models
A diffusion model (DM) constructs the probability path as:
$$p_t = q_{\text{data}} \ast \mathcal{N}(\mathbf{0}, \beta_t \mathbf{I}), $$
where $\ast$ denotes convolution and $(\beta_t)_{t\in[0,1]}$ is a decreasing noise schedule with $\beta_1 \approx 0$ (so that $p_1 \approx q_{\text{data}}$) and $\beta_0$ large (so that $p_0$ is approximately Gaussian). There are primarily two approaches to realizing a process $\mathbf{x}_{t\in[0,1]}$ with these marginals:
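A minimal numerical sketch of these marginals: sampling from $p_t = q_{\text{data}} \ast \mathcal{N}(\mathbf{0}, \beta_t \mathbf{I})$ amounts to drawing a data point and adding Gaussian noise of variance $\beta_t$. The 1-D mixture for $q_{\text{data}}$ and the schedule $\beta_t = 25(1-t)^2$ below are purely illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D "data" distribution q_data: a two-component Gaussian mixture
# (a hypothetical choice for illustration; any data distribution would do).
def sample_q_data(n):
    comp = rng.integers(0, 2, size=n)
    return np.where(comp == 0,
                    rng.normal(-2.0, 0.3, n),
                    rng.normal(2.0, 0.3, n))

# Noise schedule beta_t: decreasing, with beta_1 = 0 so that p_1 = q_data,
# and beta_0 large so that p_0 is close to a Gaussian.  Hypothetical choice.
def beta(t):
    return 25.0 * (1.0 - t) ** 2

# Sampling from p_t = q_data * N(0, beta_t I) is simply "add Gaussian noise":
def sample_p_t(t, n):
    return sample_q_data(n) + np.sqrt(beta(t)) * rng.normal(size=n)

x0 = sample_p_t(0.0, 10_000)  # near-Gaussian marginal at t = 0
x1 = sample_p_t(1.0, 10_000)  # exactly the data distribution at t = 1
```

Note that this only samples the marginals $p_t$; it says nothing yet about the joint law of the process, which is exactly where the two approaches below differ.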
DDPM and its variants: These models progressively inject Gaussian noise into the sample. As a result, both the forward and backward processes solve a specific stochastic differential equation (SDE) for diffusion.
DDIM and its variants: These models fix a single noise source $\boldsymbol{\epsilon}$ and set $\mathbf{x}_t$ as a linear combination of the data point and the noise $\boldsymbol{\epsilon}$. The coefficients vary over time so that $\mathbf{x}_t$ equals the data point at one end and the noise at the other. The sampling process for these models solves a particular ordinary differential equation (ODE) known as the probability-flow ODE.
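The two realizations can be contrasted in a toy 1-D sketch (the schedule and data below are hypothetical and not taken from either paper): a DDIM-style path freezes one noise draw per sample, while a DDPM-style path injects fresh noise at every step. Because independent Gaussian variances add, both end up with the same marginal at $t = 0$, even though the joint laws differ.

```python
import numpy as np

rng = np.random.default_rng(1)
n, steps = 20_000, 200
ts = np.linspace(0.0, 1.0, steps + 1)     # noise (t = 0) -> data (t = 1)
beta = lambda t: 25.0 * (1.0 - t) ** 2    # hypothetical decreasing schedule

x1 = rng.normal(2.0, 0.3, n)              # toy 1-D "data" samples

# DDIM-style realization: one fixed noise draw per sample; x_t is a linear
# combination of the data point and that same noise at every time.
eps = rng.normal(size=n)
x_ddim_0 = x1 + np.sqrt(beta(0.0)) * eps  # state at t = 0

# DDPM-style (forward) realization: run time backwards from the data and
# inject *fresh* Gaussian noise at each step; the incremental variances
# add up, so the marginal at t = 0 is the same convolved distribution.
x_ddpm_0 = x1.copy()
for t_hi, t_lo in zip(ts[::-1][:-1], ts[::-1][1:]):  # t: 1 -> 0
    dvar = beta(t_lo) - beta(t_hi)                   # incremental variance
    x_ddpm_0 += np.sqrt(dvar) * rng.normal(size=n)

# Both realizations share the marginal at t = 0; only the joint laws differ.
```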
In summary, a diffusion model interpolates between the data distribution and the Gaussian source distribution via diffusion, and the corresponding process can be simulated (hence samples generated) by solving either the SDE or the ODE with various integrators.
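As a sketch of ODE-based sampling, here is a plain Euler integration of the probability-flow ODE, $\mathrm{d}\mathbf{x}_t/\mathrm{d}t = -\tfrac{1}{2}\dot{\beta}_t \nabla \log p_t(\mathbf{x}_t)$, in a toy case where $q_{\text{data}}$ is itself Gaussian, so the score $\nabla \log p_t$ is available in closed form (all concrete numbers are illustrative assumptions, not a production sampler):

```python
import numpy as np

# Toy setting with a closed-form score: q_data = N(mu, sigma2) gives
# p_t = N(mu, sigma2 + beta_t), so score_t(x) = -(x - mu) / (sigma2 + beta_t).
mu, sigma2 = 1.0, 0.25
beta = lambda t: 25.0 * (1.0 - t) ** 2    # hypothetical decreasing schedule
dbeta = lambda t: -50.0 * (1.0 - t)       # its time derivative

def score(x, t):
    return -(x - mu) / (sigma2 + beta(t))

rng = np.random.default_rng(4)
n, steps = 20_000, 1000
dt = 1.0 / steps

# Start from the near-Gaussian source marginal p_0 = N(mu, sigma2 + beta_0).
x = rng.normal(mu, np.sqrt(sigma2 + beta(0.0)), n)

# Euler integration of dx/dt = -(1/2) * beta'(t) * score_t(x)
# transports p_0 samples to (approximately) q_data at t = 1.
for k in range(steps):
    t = k * dt
    x = x + dt * (-0.5) * dbeta(t) * score(x, t)
```

With a non-Gaussian $q_{\text{data}}$, the closed-form `score` would be replaced by a learned score network; the integration loop stays the same.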
Now that we've reviewed some key ideas, let's discuss why non-Gaussian diffusion models have not gained as much traction:
While restricting the noise to Gaussian distributions might seem limiting at first glance, it is known that essentially all "well-behaved" continuous stochastic processes $\mathbf{x}_{t\in[0,1]}$ can be realized using Gaussian noise. (This is related to the functional central limit theorem.) Exceptions are typically pathological cases exhibiting either: a) discontinuous or extremely irregular sample paths, or b) long-range dependence over time.
Finally, although this claim no longer holds for discrete-time processes, restricting time to a fixed grid of steps significantly limits the model's flexibility. In summary, processes driven by Gaussian noise injection seem to hit a sweet spot, balancing the various desiderata for a generative model.
Gaussian noise also offers particular advantages in making the sampling procedure tractable. Denoising a sample corrupted by Gaussian noise is especially convenient thanks to Tweedie's formula, which expresses the posterior mean of the clean sample in terms of the score function of the noisy marginal. This naturally brings the score function to the forefront of diffusion-model research.
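Here is a small numerical illustration of Tweedie's formula, again in a toy Gaussian setting where the score is known exactly (the constants are illustrative assumptions):

```python
import numpy as np

# Tweedie's formula: if x_t = x + sqrt(beta_t) * eps with Gaussian eps, then
#   E[x | x_t] = x_t + beta_t * score_t(x_t),   where score_t = grad log p_t.
# Toy setting with a closed-form score: q_data = N(mu, sigma2) gives
# p_t = N(mu, sigma2 + beta_t).  The numbers below are illustrative.
mu, sigma2, beta_t = 1.5, 0.5, 2.0

def score(x_t):
    return -(x_t - mu) / (sigma2 + beta_t)

def tweedie_denoise(x_t):
    return x_t + beta_t * score(x_t)

rng = np.random.default_rng(2)
x = rng.normal(mu, np.sqrt(sigma2), 100_000)          # clean samples
x_t = x + np.sqrt(beta_t) * rng.normal(size=x.shape)  # noised samples
x_hat = tweedie_denoise(x_t)

# x_hat is the posterior mean E[x | x_t]; its mean squared error equals the
# posterior variance sigma2 * beta_t / (sigma2 + beta_t), well below the
# naive estimate's MSE of beta_t.
mse_tweedie = np.mean((x_hat - x) ** 2)   # ~ 0.4
mse_naive = np.mean((x_t - x) ** 2)       # ~ 2.0
```

In a real diffusion model the score is not available in closed form; the learned score network stands in for `score`, and Tweedie's formula is what turns its output into a one-step denoised estimate.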
Many generative models, including flow matching, rectified flows, stochastic interpolants, and Poisson flow generative models, fit neatly into the scheme described earlier. The vast diversity of possible probability paths and their pathwise realizations offers substantial flexibility in extending the model, surpassing the flexibility offered by non-Gaussian DMs.
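For instance, the straight-line probability path used by rectified flows and stochastic interpolants fits the same three-step recipe, with no diffusion involved at all. A hypothetical 1-D sketch:

```python
import numpy as np

# A non-diffusion probability path: the straight-line interpolant
#   x_t = (1 - t) * x_0 + t * x_1,   x_0 ~ p_0 = N(0, 1),   x_1 ~ q_data.
# The endpoint marginals are matched by construction.  The 1-D "data"
# distribution below is an illustrative assumption.
rng = np.random.default_rng(3)
n = 20_000

x0 = rng.normal(size=n)           # easy-to-sample source p_0
x1 = rng.normal(2.0, 0.3, n)      # toy "data" samples

def x_at(t):
    return (1.0 - t) * x0 + t * x1   # pathwise linear interpolation
```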