Sometimes when I train a DC-GAN on an image dataset, similar to the DC-GAN PyTorch example (https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html), either the Generator or the Discriminator loss gets stuck at a large value while the other goes to zero. How should I interpret what is going on right after iteration 1500 in the example loss plot shown below? Is this an example of mode collapse? Any recommendations for how to make the training more stable? I have tried reducing the learning rate of the Adam optimizer, with varying degrees of success. Thanks!
1 Answer
GANs are notoriously hard to train, and large bumps in the losses are not uncommon. Adjusting the learning rate is a good start, but the instability can have a wide variety of causes. I'm assuming that there is no bug in your code or data.
For one, gradient descent is not well suited to the two-player game we're playing. I've personally found ExtraAdam to yield much more stable training (code, paper).
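To make the two-player point concrete, here is a minimal sketch on a toy game, not a GAN. It assumes the bilinear objective f(x, y) = x * y, where x minimizes and y maximizes, and it uses plain extragradient rather than ExtraAdam itself (ExtraAdam applies the same look-ahead idea with Adam-style updates). The function names, step size, and step count are all illustrative choices of mine.

```python
import math


def simultaneous_gd(x, y, lr, steps):
    """Both players take a plain gradient step at the same time."""
    for _ in range(steps):
        grad_x, grad_y = y, x  # df/dx = y, df/dy = x for f(x, y) = x * y
        x, y = x - lr * grad_x, y + lr * grad_y  # x descends, y ascends
    return x, y


def extragradient(x, y, lr, steps):
    """Take a look-ahead step, then update using the gradients found there."""
    for _ in range(steps):
        # Extrapolation (look-ahead) step from the current point.
        x_half, y_half = x - lr * y, y + lr * x
        # Real update from the current point, using look-ahead gradients.
        x, y = x - lr * y_half, y + lr * x_half
    return x, y


def distance_to_equilibrium(point):
    """The only equilibrium of this toy game is (0, 0)."""
    return math.hypot(*point)


start_dist = distance_to_equilibrium((1.0, 1.0))
gd_dist = distance_to_equilibrium(simultaneous_gd(1.0, 1.0, lr=0.1, steps=1000))
eg_dist = distance_to_equilibrium(extragradient(1.0, 1.0, lr=0.1, steps=1000))

print(f"start: {start_dist:.3f}  simultaneous GD: {gd_dist:.3f}  extragradient: {eg_dist:.5f}")
```

On this toy problem the simultaneous updates spiral away from the equilibrium while the extragradient updates spiral in. A real GAN loss is not bilinear, so treat this as intuition only.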
The instability could also come from the loss. Many tricks have been developed, and one of the most popular is enforcing smoothness of the discriminator by constraining its gradients (see W-GAN, W-GAN-GP, etc.). SpectralNorm (code, paper) is a very popular and recent normalization technique for the discriminator.
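As a rough sketch of how spectral normalization could be added, the snippet below wraps each convolution of a DCGAN-style discriminator for 64x64 images in `torch.nn.utils.spectral_norm`. The layer sizes follow the tutorial's discriminator, but the `make_discriminator` helper, dropping BatchNorm, and returning raw logits are my own assumptions, not something the tutorial does.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm


def make_discriminator(nc=3, ndf=64):
    """DCGAN-style discriminator for 64x64 inputs with spectral norm on every conv.

    Hypothetical helper: BatchNorm is left out and the output is a raw logit,
    so pair it with nn.BCEWithLogitsLoss instead of Sigmoid + nn.BCELoss.
    """
    return nn.Sequential(
        spectral_norm(nn.Conv2d(nc, ndf, 4, 2, 1, bias=False)),           # 64 -> 32
        nn.LeakyReLU(0.2, inplace=True),
        spectral_norm(nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False)),      # 32 -> 16
        nn.LeakyReLU(0.2, inplace=True),
        spectral_norm(nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False)),  # 16 -> 8
        nn.LeakyReLU(0.2, inplace=True),
        spectral_norm(nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False)),  # 8 -> 4
        nn.LeakyReLU(0.2, inplace=True),
        spectral_norm(nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False)),        # 4 -> 1
    )


netD = make_discriminator()
fake_batch = torch.randn(4, 3, 64, 64)  # stand-in for a batch of images
logits = netD(fake_batch).view(-1)      # one raw score per image
```

The generator and the rest of the training loop can stay as they are; only the discriminator definition and the loss change.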
There are a number of additional tricks for making GANs work, such as label smoothing and flipping, and different update rates for the discriminator and the generator (as in BigGAN, for instance). I suggest having a look at this nice repo of (somewhat seasoned) tricks: ganhacks.
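Two of these tricks are easy to sketch in PyTorch. The snippet below assumes one-sided label smoothing (real targets of 0.9, fake targets left at 0) with optional random label flipping, and it reads "different update rates" as separate learning rates for the two optimizers. The `discriminator_targets` helper, the stand-in networks, and the specific values are illustrative, not tuned recommendations.

```python
import torch
import torch.nn as nn


def discriminator_targets(batch_size, real, smooth=0.9, flip_prob=0.0):
    """Targets for the discriminator loss (hypothetical helper).

    Real samples get `smooth` instead of 1.0, fake samples stay at 0.0.
    With probability `flip_prob`, a sample's target is swapped to the
    other class.
    """
    value, flipped_value = (smooth, 0.0) if real else (0.0, smooth)
    targets = torch.full((batch_size,), value)
    if flip_prob > 0:
        flip = torch.rand(batch_size) < flip_prob
        targets = torch.where(flip, torch.full_like(targets, flipped_value), targets)
    return targets


# Stand-ins so the snippet runs on its own; use your real networks here.
netD = nn.Linear(10, 1)
netG = nn.Linear(10, 10)

# Separate learning rates for the two players (illustrative values).
optimizerD = torch.optim.Adam(netD.parameters(), lr=4e-4, betas=(0.5, 0.999))
optimizerG = torch.optim.Adam(netG.parameters(), lr=1e-4, betas=(0.5, 0.999))

real_targets = discriminator_targets(8, real=True)
fake_targets = discriminator_targets(8, real=False)
```

In the tutorial's training loop, `real_targets` and `fake_targets` would replace the label tensors filled with `real_label` and `fake_label` for the discriminator update. Another reading of "different update rates" is taking several discriminator steps per generator step, which needs only an inner loop.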
 
    