Questions tagged [diffusion-models]

76 questions
10
votes
2 answers

Using AI to extend an image pattern

I have created some nice patterns using the Midjourney tool. I'd like to find a way to extend these patterns, and I was thinking about an AI tool that takes one of these patterns and extends it in all directions surrounding the original…
6
votes
5 answers

Why does Stable Diffusion use VAE instead of AE?

I am currently studying Latent Diffusion Models (LDMs) and am interested in training my own model using a unique dataset. In my research, I came across Stable Diffusion (SD). Some sources suggest that SD employs VAEs for the encoding and…
P0TAT0
4
votes
2 answers

Any popular diffusion model for language modeling?

Is there a popular diffusion model-based framework for language modelling? If not, is it because of the difficulty of sampling from discrete distributions?
4
votes
2 answers

Do LLMs based on a diffusion model (as opposed to an autoregressive model) exist?

Is there such a thing (as described in the title), at least in research papers (not actual models)? So far, all the LLMs that I know of are autoregressive models.
anon
3
votes
2 answers

How do diffusion-based text-to-image generation models mathematically classify inputs to outputs?

I've been exploring the capabilities of diffusion-based text-to-image models and am curious about their underlying mathematical framework. Specifically, I'm interested in understanding how the model handles the mapping from textual inputs to image…
hanugm
3
votes
1 answer

How do stable diffusion models take the data into account?

I'm interested in how text-to-image models like Midjourney and DALL-E work, where you enter a text prompt and get some images as output. I started reading some papers on it and stumbled upon "Denoising Diffusion Probabilistic Models" -…
3
votes
1 answer

Understanding the function of attention layers in a convolutional neural network (U-Net in a diffusion model)

I am trying to understand the neural network architecture used by Ho et al. in "Denoising Diffusion Probabilistic Models" (paper, source code). They include self-attention layers in the model, applying them to the feature maps output by previous…
3
votes
0 answers

Relation between SDE diffusion and DDPM/DDIM

Background & Definitions: In DDPM, the diffusion backward step is described as follows (where $z\sim \mathcal{N}(0,I)$ and $x_{T}\sim \mathcal{N}(0,I)$): and in DDIM we have while in the SDE formulation (from the Fokker-Planck equation) the step…
3
votes
1 answer

Understanding the functionality of the switch in the latent diffusion models: Does conditioning information pass to both cross attention and $z_{T}$?

Consider the following diagram from the paper titled "High-Resolution Image Synthesis with Latent Diffusion Models" by Robin Rombach et al. In the context of this diagram, I'm uncertain about the functionality of a particular component referred to…
hanugm
3
votes
1 answer

What's the architecture that allows the generation of new images based on an input image in tools like Midjourney?

I understand that the high-level architecture of tools like Midjourney uses diffusion models to generate images from text. What I don't understand is which type of network architecture allows the second step of their workflow - generating new, similar…
emilaz
3
votes
1 answer

Clarification on the training objective of denoising diffusion models

I'm reading the Denoising Diffusion Probabilistic Models paper (Ho et al. 2020), and I am puzzled about the training objective. I understood (I think) the trick regarding the reparametrization of the variance in terms of the noise: $$\mu_\theta(x_t,…
3
votes
0 answers

Reverse Process in Diffusion Model Doesn't Return Original Image

I am attempting to program a Denoising Diffusion Model based on the one introduced in the article by Ho et al. (2020). However, I have run into issues while testing the reverse diffusion process. Walking through my PyTorch code, I first load an…
2
votes
1 answer

Characterization of a coupling based on an ODE solver

Let us assume $ p_0 $ is a probability distribution over $ \mathbb{R}^d $. Let $ x_t $ be a diffusion process defined as: \begin{equation} x_t = x + \sigma_t z, \end{equation} where $ x \sim p_0 $, $ z \sim \mathcal{N}(0,I) $, and $ \sigma_t <…
tzb
2
votes
1 answer

Are there any attempts to expand diffusion model denoising to other noise types?

Diffusion models usually denoise Gaussian noise. Can they be applied to other types of noise, such as Perlin, Simplex, value, and cell noise? This assumes that the reverse denoising process is still applied.
user86196
2
votes
1 answer

Deriving ELBO for Diffusion Models

I am trying to read through the proof of the ELBO for diffusion models on pg. 8 of this paper. However, I do not see how the author arrived at Eqn (45) from Eqn (44). Specifically, I do not know how they simplified the equation by rewriting it in terms…