Questions tagged [reparameterization-trick]
For questions about the reparameterization trick, used e.g. in variational autoencoders (VAEs).
3 questions
4
votes
3 answers
In the VAE, why is $z \sim \mathcal{N}(\mu, \sigma^2)$ equivalent to $z = \mu + \sigma \odot \epsilon$?
In the reparameterization trick of a Variational Autoencoder (VAE), instead of sampling the latent $z$ directly from $z \sim \mathcal{N}(\mu, \sigma^2)$, we can use a different method: $z = \mu + \sigma \odot \epsilon$, where $\epsilon \sim \mathcal{N}(0,1)$. I'm…
user77925
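A quick way to see the equivalence asked about above is numerically: drawing $\epsilon \sim \mathcal{N}(0,1)$ and computing $z = \mu + \sigma \epsilon$ yields samples with the same distribution as drawing $z \sim \mathcal{N}(\mu, \sigma^2)$ directly, but $z$ is now a deterministic, differentiable function of $\mu$ and $\sigma$. A minimal NumPy sketch (values of $\mu$ and $\sigma$ chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.5, 0.5

# Direct sampling: z ~ N(mu, sigma^2)
z_direct = rng.normal(mu, sigma, size=100_000)

# Reparameterized sampling: eps ~ N(0, 1), then z = mu + sigma * eps
eps = rng.standard_normal(100_000)
z_reparam = mu + sigma * eps

# Both sample sets have (approximately) the same mean and standard
# deviation; the difference is that z_reparam is differentiable with
# respect to mu and sigma, so gradients can flow through the sample.
print(z_direct.mean(), z_reparam.mean())
print(z_direct.std(), z_reparam.std())
```

The key point is not the sampling itself but the gradient path: in a VAE, `mu` and `sigma` are network outputs, and the trick lets backpropagation treat the randomness (`eps`) as a fixed external input.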
2
votes
2 answers
Is the reparameterization trick necessary in the policy gradient method?
If we want to learn a stochastic policy with the policy gradient method, we have to sample from the distribution to get an action.
Wouldn't this lead to the same issue that variational autoencoders face without the reparameterization trick, where…
Sam
- 205
- 1
- 5
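The short answer hinted at by this question is that policy gradient methods typically use the score-function (REINFORCE) estimator, $\nabla_\theta \mathbb{E}[R] = \mathbb{E}[R \, \nabla_\theta \log \pi_\theta(a|s)]$, which needs no reparameterization because the gradient never passes through the sample itself. A hedged NumPy sketch for a toy 1-D Gaussian "policy" $a \sim \mathcal{N}(\mu, 1)$ and "return" $R(a) = a^2$ (all names here are illustrative, not from any library):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 1.0
a = rng.normal(mu, 1.0, size=500_000)  # sampled actions

# Score-function (REINFORCE) estimator of d/dmu E[a^2]:
# grad_mu log N(a; mu, 1) = (a - mu), so the estimator is
# the sample mean of R(a) * (a - mu). No gradient flows
# through the sampling step, so no reparameterization is needed.
grad_score = np.mean(a**2 * (a - mu))

# Analytically, E[a^2] = mu^2 + 1, so the true gradient is 2*mu = 2.0.
print(grad_score)
```

The reparameterization trick becomes relevant (e.g. in Soft Actor-Critic) when one *wants* a lower-variance pathwise gradient through the sampled action, but it is not required for the basic policy gradient to be unbiased.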
0
votes
1 answer
Why clamp log std for the reparameterization trick between -20 and 2?
In the Soft Actor-Critic paper (https://arxiv.org/pdf/1801.01290.pdf), a neural network approximates a diagonal Gaussian distribution. In the sample function you can see that it calls a function named reparameterize. As you can…
chadmc
- 13
- 3
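On the clamping question above: common SAC implementations clamp the network's *log* standard deviation to roughly $[-20, 2]$ before exponentiating, which bounds the std to $[e^{-20} \approx 2\times10^{-9},\ e^{2} \approx 7.39]$, guaranteeing a positive, numerically stable std. A minimal sketch (the bounds match those typically seen in SAC codebases; the function name is illustrative):

```python
import numpy as np

# Bounds commonly used in SAC implementations (an assumption here,
# not something fixed by the paper itself).
LOG_STD_MIN, LOG_STD_MAX = -20.0, 2.0

def sample_action(mean, log_std, rng):
    # Clamp the *log* std, not the std: exponentiating afterwards keeps
    # std in [exp(-20), exp(2)], so it is always strictly positive and
    # can neither collapse to 0 (degenerate Gaussian, infinite log-prob)
    # nor overflow to huge values that destabilize training.
    log_std = np.clip(log_std, LOG_STD_MIN, LOG_STD_MAX)
    std = np.exp(log_std)
    eps = rng.standard_normal(np.shape(mean))  # reparameterization trick
    return mean + std * eps

rng = np.random.default_rng(0)
# Even an extreme network output (log_std = 50) yields a finite action,
# because the effective std is clamped to exp(2) ~ 7.39.
a = sample_action(np.zeros(3), np.full(3, 50.0), rng)
print(a)
```

Clamping in log space rather than clamping the std directly also keeps the parameterization smooth where it is not saturated, which tends to behave better under gradient descent.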