
Every explanation of variational inference starts with the same basic premise: given an observed variable $x$ and a latent variable $z$,

$$ p(z|x)=\frac{p(x,z)}{p(x)} $$

and then proceeds to expand $p(x)$ as an expectation over $z$:

$$ p(x) = \int{p(x,z)dz} $$

and then states that it's too difficult to evaluate.

My very basic question is: why is $p(x)$ not simply equal to 1? It's an observed variable!

nbro

1 Answer


You're forgetting that $x$ can take several values with different probabilities. Say $x$ represents the roll of a fair die; then $p(x)$ is 1/6 for each of the six possible values of $x$.
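To make that concrete, here is a minimal Python sketch (a hypothetical illustration, not part of the original answer) that simulates observed rolls of a fair die and estimates $p(x)$ empirically; each face comes out near 1/6, not 1.

```python
import random

# Simulate observed rolls of a fair six-sided die and estimate p(x)
# empirically for each face. Purely illustrative numbers/setup.
random.seed(0)
rolls = [random.randint(1, 6) for _ in range(60_000)]

for face in range(1, 7):
    p_hat = rolls.count(face) / len(rolls)
    # Each estimate is close to 1/6 ~ 0.167, not 1, even though every roll was observed.
    print(f"p(x = {face}) ~ {p_hat:.3f}")
```

The same point carries over when we write Bayes' theorem for a model parameter $\theta$: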

$$ p(\theta|x)=\frac{p(x|\theta)p(\theta)}{p(x)} $$

If you rearrange the formula, it becomes clear that the whole point of Bayes' theorem is that we want the prior $p(\theta)$ and the posterior $p(\theta|x)$ to match:

$$ \frac{p(\theta|x)}{p(\theta)}=1=\frac{p(x|\theta)}{p(x)} $$

because if the left-hand side equals 1, then the likelihood predicted by our parameters, $p(x|\theta)$, equals the true observed probability $p(x)$. So if we train a model on some observed dice rolls, we expect a perfect model to learn, for each face of the die, the true probability $p(x)$, which is 1/6 and not 1.
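Coming back to the integral $p(x) = \int p(x,z)\,dz$ from the question, here is a small sketch with a made-up discrete latent variable $z$ (hypothetical numbers, just for illustration): the marginal evaluated at an observed value of $x$ is an ordinary probability, generally not 1.

```python
# Marginalising out a discrete latent z: p(x) = sum_z p(z) * p(x | z).
# The probabilities below are invented for illustration only.

p_z = {0: 0.4, 1: 0.6}                       # prior p(z)
p_x_given_z = {                              # likelihood p(x | z) for two outcomes of x
    0: {"heads": 0.9, "tails": 0.1},
    1: {"heads": 0.2, "tails": 0.8},
}

def marginal(x):
    """p(x) = sum over z of p(z) * p(x | z)."""
    return sum(p_z[z] * p_x_given_z[z][x] for z in p_z)

print(marginal("heads"))   # 0.4*0.9 + 0.6*0.2 = 0.48 (up to float rounding), not 1
print(marginal("tails"))   # 0.4*0.1 + 0.6*0.8 = 0.52
```

In variational inference the difference is only that $z$ is typically continuous and high-dimensional, so this sum becomes an integral that is too difficult to evaluate exactly, which is why $p(x)$ has to be approximated rather than computed.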

Edoardo Guerriero