I have recently been learning about diffusion models and trying to derive all the results in the paper by Sohl-Dickstein et al., "Deep Unsupervised Learning using Nonequilibrium Thermodynamics" (2015): https://arxiv.org/pdf/1503.03585.pdf

In Appendix B of this paper, they define the log-likelihood lower bound term as $$K = \int{dx^{(0...T)}} q(x^{(0...T)}) \text{log} \Bigg[p(x^{(T)}) \prod_{t=1}^{T} \frac{p(x^{(t-1)}|x^{(t)})}{q(x^{(t)}|x^{(t-1)})} \Bigg]$$

They then isolate the $p(x^{(T)})$ factor inside the square brackets. Since the log of a product is a sum of logs, the next step of the calculation should be,

$$K = \int{dx^{(0...T)}} q(x^{(0...T)}) \sum_{t=1}^{T} \text{log} \Bigg[ \frac{p(x^{(t-1)}|x^{(t)})}{q(x^{(t)}|x^{(t-1)})} \Bigg] + \int{dx^{(0...T)}} q(x^{(0...T)}) \text{ log } p(x^{(T)})$$

The first integral above looks the same as the first term in the paper. I am trying to reduce the second integral above to $$\int dx^{(T)} q(x^{(T)}) \text{ log } p(x^{(T)})$$

For this, I am using the fact that $q(x^{(0...T)})$ can be written as,

$$q(x^{(0...T)}) = q(x^{(0)}) \prod_{t=1}^{T} q(x^{(t)}|x^{(t-1)})$$

Then I am writing the second integral above as,

$$\int{dx^{(0...T)}} q(x^{(0...T)}) \text{ log } p(x^{(T)}) \\= \int dx^{(0)} q(x^{(0)}) \int dx^{(1)}q(x^{(1)}|x^{(0)}) \ldots \int dx^{(T-1)}q(x^{(T-1)}|x^{(T-2)}) \int dx^{(T)}q(x^{(T)}|x^{(T-1)}) \text{ log } p(x^{(T)}) \\ = \int dx^{(0)} q(x^{(0)}) \Bigg(\prod_{t=1}^{T-1} \int dx^{(t)}q(x^{(t)}|x^{(t-1)}) \Bigg) \int dx^{(T)}q(x^{(T)}|x^{(T-1)}) \text{ log } p(x^{(T)})$$

Question 1: Is the above expression correct?

Moreover, I think we can also write any $q(x^{(t)}|x^{(t-1)}) = \frac{q(x^{(t)}, x^{(t-1)})}{q(x^{(t-1)})}$ (by the definition of conditional probability). Does that mean that each of the integrals within the parentheses in the previous step can be written as,

$$ \int dx^{(t)}q(x^{(t)}|x^{(t-1)}) = \int dx^{(t)} \frac{q(x^{(t)}, x^{(t-1)})}{q(x^{(t-1)})} = \frac{1}{q(x^{(t-1)})} \int dx^{(t)} q(x^{(t)}, x^{(t-1)}) = \frac{1}{q(x^{(t-1)})} q(x^{(t-1)}) = 1$$

Also, we have $\int{dx^{(0)}} q(x^{(0)}) = 1$. Following the above line of thought, the second integral becomes,

$$\int dx^{(T)}q(x^{(T)}|x^{(T-1)}) \text{ log } p(x^{(T)})$$

I am not able to proceed further than this step.

Question 2: Is there a mistake in one of my previous steps, and how should I proceed to get the required expression for the second integral term of $K$?

ahxmeds

1 Answer

The expression in Question 1 is indeed correct. The normalization $\int dx^{(t)} q(x^{(t)}|x^{(t-1)}) = 1$ is also correct. The problem is that this integral never appears on its own in your expression, because the integrals are nested: everything to the right of $\int dx^{(t)}$ is part of its integrand, and the next factor $q(x^{(t+1)}|x^{(t)})$ also depends on $x^{(t)}$. In $$\int dx^{(0)} q(x^{(0)}) \Bigg(\prod_{t=1}^{T-1} \int dx^{(t)}q(x^{(t)}|x^{(t-1)}) \Bigg) \int dx^{(T)}q(x^{(T)}|x^{(T-1)}) \text{ log } p(x^{(T)})$$ if you want to integrate out the $x^{(t)}$ variable, you actually need to simplify $$ \int dx^{(t)} q(x^{(t)}|x^{(t-1)}) q(x^{(t+1)}|x^{(t)}) $$ for $t \notin \{0, T\}$.

For a Markov chain this is doable using the Chapman–Kolmogorov equation, which implies: $$ \int dx^{(t)} q(x^{(t)}|x^{(t-1)}) q(x^{(t+1)}|x^{(t)}) = q(x^{(t+1)}|x^{(t-1)}) $$
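If it helps to see the Chapman–Kolmogorov step concretely, here is a minimal numerical sketch. It assumes a discrete state space, so the integrals become sums and each conditional becomes a row-stochastic matrix. The names `q_prev`, `q1`, `q2` and the random kernels are illustrative assumptions, not anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states = 4  # assumed size of a toy discrete state space


def random_kernel(n):
    """Row-stochastic matrix: entry [i, j] stands for q(next = j | current = i)."""
    k = rng.random((n, n))
    return k / k.sum(axis=1, keepdims=True)


q_prev = rng.random(n_states)
q_prev /= q_prev.sum()        # q(x^(t-1)), an arbitrary starting marginal
q1 = random_kernel(n_states)  # q(x^(t)   | x^(t-1))
q2 = random_kernel(n_states)  # q(x^(t+1) | x^(t))

# Joint q(x^(t-1), x^(t), x^(t+1)) built from the Markov factorization.
joint = q_prev[:, None, None] * q1[:, :, None] * q2[None, :, :]

# Integrate (sum) out the middle variable x^(t), then condition on x^(t-1).
pair = joint.sum(axis=1)                            # q(x^(t-1), x^(t+1))
two_step = pair / pair.sum(axis=1, keepdims=True)   # q(x^(t+1) | x^(t-1))

# Chapman-Kolmogorov: the two-step kernel equals the product of one-step kernels.
ck_holds = np.allclose(two_step, q1 @ q2)
print(ck_holds)
```

In this discrete setting, summing out the intermediate state is just a matrix product, which is why repeating the step collapses the whole chain into a single kernel from $x^{(0)}$ to $x^{(T)}$.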

Doing the above for every variable except $x^{(T)}$ and $x^{(0)}$ leads to: $$\int dx^{(T)} \int dx^{(0)} q(x^{(0)}) q(x^{(T)}|x^{(0)}) \log p(x^{(T)}) = \int dx^{(T)} q(x^{(T)}) \log p(x^{(T)})$$ where in the last step I integrated over $x^{(0)}$, using $\int dx^{(0)} q(x^{(0)}) q(x^{(T)}|x^{(0)}) = q(x^{(T)})$, and obtained the term you wanted.
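As a sanity check on the final identity, the following sketch compares both sides by brute force on a small discrete chain. Everything here is an assumption for illustration: the state space is finite, the kernels and the stand-in for $p(x^{(T)})$ are random, and `n_states` and `T` are kept tiny so that every trajectory can be enumerated.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n_states, T = 3, 4  # assumed toy sizes; 3**(T+1) trajectories in total


def normalize(a):
    """Scale the last axis so that it sums to one."""
    return a / a.sum(axis=-1, keepdims=True)


q0 = normalize(rng.random(n_states))  # q(x^(0))
# kernels[t-1][i, j] stands for q(x^(t) = j | x^(t-1) = i)
kernels = [normalize(rng.random((n_states, n_states))) for _ in range(T)]
p_T = normalize(rng.random(n_states))  # stand-in for p(x^(T))

# Left-hand side: sum over every trajectory (x^(0), ..., x^(T)).
lhs = 0.0
for path in itertools.product(range(n_states), repeat=T + 1):
    prob = q0[path[0]]
    for t in range(1, T + 1):
        prob *= kernels[t - 1][path[t - 1], path[t]]
    lhs += prob * np.log(p_T[path[-1]])

# Right-hand side: push q(x^(0)) through the chain to get the marginal q(x^(T)).
q_marginal = q0
for k in kernels:
    q_marginal = q_marginal @ k
rhs = float(q_marginal @ np.log(p_T))

# The innermost sum alone is a function of x^(T-1), not a single number.
inner = kernels[-1] @ np.log(p_T)

print(np.isclose(lhs, rhs), inner)
```

The array `inner` has one entry per value of $x^{(T-1)}$, which is the discrete analogue of why the outer integrals in the question cannot be dropped as factors of 1: the innermost integral still depends on $x^{(T-1)}$ and has to be averaged over it.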