Questions tagged [diffusion-models]
76 questions
10
votes
2 answers
Using AI to extend an image pattern
I have created some nice patterns using the Midjourney tool. I'd like to find a way to extend these patterns, and I was thinking about an AI tool that takes one of these patterns and extends it in all directions surrounding the original…
Nicola Lepetit
- 203
- 2
- 8
6
votes
5 answers
Why does Stable Diffusion use VAE instead of AE?
I am currently studying Latent Diffusion Models (LDMs) and am interested in training my own model using a unique dataset. In my research, I came across Stable Diffusion (SD). Some sources suggest that SD employs VAEs for the encoding and…
P0TAT0
- 61
- 1
- 2
4
votes
2 answers
Any popular diffusion model for language modeling?
Is there a popular diffusion-model-based framework for language modelling? If not, is it because of the difficulty of sampling from discrete distributions?
Ayan Sengupta
- 41
- 1
4
votes
2 answers
Do LLMs based on a diffusion model (as opposed to an autoregressive model) exist?
Is there such a thing (as described in the title), at least in research papers (not actual models)?
So far, all the LLMs that I know of are autoregressive models.
anon
3
votes
2 answers
How do diffusion-based text-to-image generation models mathematically map inputs to outputs?
I've been exploring the capabilities of diffusion-based text-to-image models and am curious about their underlying mathematical framework. Specifically, I'm interested in understanding how the model handles the mapping from textual inputs to image…
hanugm
- 4,102
- 3
- 29
- 63
3
votes
1 answer
How do Stable Diffusion models take the data into account?
I'm interested in how text-to-image models like Midjourney and DALL-E work, where you enter a text prompt and get some images as output. I started reading some papers on it and stumbled upon "Denoising Diffusion Probabilistic Models" -…
Rohit Pandey
- 161
- 4
3
votes
1 answer
Understanding the function of attention layers in a convolutional neural network (U-Net in a diffusion model)
I am trying to understand the neural network architecture used by Ho et al. in "Denoising Diffusion Probabilistic Models" (paper, source code). They include self-attention layers in the model, applying them to the feature maps output by previous…
Rational Function
- 151
- 4
3
votes
0 answers
Relation between SDE diffusion and DDPM/DDIM
Background & Definitions
In DDPM, the diffusion backward step is described as follows (where $z\sim \mathcal{N}(0,I)$ and $x_{T}\sim \mathcal{N}(0,I)$):
and in DDIM we have
while in the SDE formulation (from the Fokker-Planck equation) the step…
snatchysquid
- 89
- 6
3
votes
1 answer
Understanding the functionality of the switch in the latent diffusion models: Does conditioning information pass to both cross attention and $z_{T}$?
Consider the following diagram from the paper titled High-Resolution Image Synthesis with Latent Diffusion Models by Robin Rombach et al.,
In the context of this diagram, I'm uncertain about the functionality of a particular component referred to…
hanugm
- 4,102
- 3
- 29
- 63
3
votes
1 answer
What's the architecture that allows the generation of new images based on an input image in tools like Midjourney?
I understand that the high-level architecture of tools like Midjourney uses diffusion models to generate images from text. What I don't understand is which type of network architecture allows the second step of their workflow - generating new, similar…
emilaz
- 133
- 4
3
votes
1 answer
Clarification on the training objective of denoising diffusion models
I'm reading the Denoising Diffusion Probabilistic Models paper (Ho et al. 2020), and I am puzzled about the training objective. I understood (I think) the trick regarding the reparametrization of the variance in terms of the noise:
$$\mu_\theta(x_t,…
user3903647
- 31
- 1
3
votes
0 answers
Reverse Process in Diffusion Model Doesn't Return Original Image
I am attempting to program a Denoising Diffusion Model based on the one introduced in the article by Ho et al. (2020). However, I have run into issues while testing the reverse diffusion process.
Walking through my PyTorch code, I first load an…
cabralpinto
- 31
- 4
2
votes
1 answer
Characterization of a coupling based on an ODE solver
Let us assume $ p_0 $ is a probability distribution over $ \mathbb{R}^d $. Let $ x_t $ be a diffusion process defined as:
\begin{equation}
x_t = x + \sigma_t z,
\end{equation}
where $ x \sim p_0 $, $ z \sim \mathcal{N}(0,I) $, and $ \sigma_t <…
tzb
- 21
- 2
2
votes
1 answer
Are there any attempts to expand diffusion model denoising to other noise types?
Diffusion models usually denoise Gaussian noise. Can they be applied to other types of noise, such as Perlin, simplex, value, and cell noise? This assumes that the reverse denoising process is still applied.
user86196
2
votes
1 answer
Deriving ELBO for Diffusion Models
I am trying to read through the proof of the ELBO for diffusion models on pg. 8 of this paper. However, I do not see how the author arrived at Eqn (45) from Eqn (44). Specifically, I do not know how they simplified the equation by rewriting it in terms…
Nikhil Sridhar
- 23
- 2