
I have a variational convolutional autoencoder that was trained on only 2 images. It outputs a linear interpolation (inserted at the bottleneck stage) between those 2 input images.


However, the result looks (rather disappointingly) like a PowerPoint cross-fade effect.

What would be needed to obtain a more novel transition effect, for instance limbs moving from '1' and seamlessly reshaping into '7'?

James

1 Answer


You need more training images: far more, at least a few hundred, with variations. With only two end points, the latent space has no meaningful structure. The decoder has seen no examples of intermediate forms that it could arrange along the path between typical 1s and typical 7s.

Also, if the encoded space contains separate classes, points in the middle are not guaranteed to decode to coherent transformations between them. More training examples will help, but some regions of the latent space will still look odd. In human face examples you can often observe this when the path through latent space crosses between glasses and no glasses: the transition can fall apart.
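For reference, the interpolation step itself is simple, so it is unlikely to be the part that needs changing. Below is a minimal, dependency-free sketch of walking a straight line between two latent codes. The names `interpolate_latents`, `decode_path`, `z_one` and `z_seven` are made up for illustration, and `decode` stands in for whatever decoder your trained model exposes; none of this is taken from your code.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def interpolate_latents(z_start: Sequence[float],
                        z_end: Sequence[float],
                        steps: int) -> List[Vector]:
    """Return `steps` latent vectors evenly spaced on the line z_start -> z_end."""
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both end points")
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same length")

    path: List[Vector] = []
    for i in range(steps):
        t = i / (steps - 1)  # t runs from 0.0 (start) to 1.0 (end)
        path.append([(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


def decode_path(decode: Callable[[Vector], object],
                path: List[Vector]) -> List[object]:
    """Apply a decoder (hypothetical; supplied by the caller) to each latent point."""
    return [decode(z) for z in path]


if __name__ == "__main__":
    # Toy 2-D latent codes standing in for the encodings of a '1' and a '7'.
    z_one = [0.0, 2.0]
    z_seven = [4.0, -2.0]
    for z in interpolate_latents(z_one, z_seven, steps=5):
        print(z)
```

The quality of the frames depends on what the decoder has learned to produce for the intermediate points, which is why the training set matters more than the interpolation code.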

Neil Slater