
I have been reading this TensorFlow tutorial on transfer learning, where they unfreeze the whole model and then say:

When you unfreeze a model that contains BatchNormalization layers in order to do fine-tuning, you should keep the BatchNormalization layers in inference mode by passing training=False when calling the base model. Otherwise the updates applied to the non-trainable weights will suddenly destroy what the model has learned.

My question is: why? The model's weights are adapting to the new data, so why do we keep the old mean and variance, which were calculated on ImageNet? This is very confusing.
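For reference, here is a minimal sketch of the pattern the tutorial describes. The base model, input size, and classification head here are my own assumptions for illustration, not taken from the tutorial verbatim:

```python
import tensorflow as tf

# Sketch of fine-tuning with BatchNormalization kept in inference mode.
# Base model choice and input shape are assumptions for illustration.
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3),
    include_top=False,
    weights="imagenet",
)
base_model.trainable = True  # unfreeze the whole model for fine-tuning

inputs = tf.keras.Input(shape=(160, 160, 3))
# training=False keeps the BatchNormalization layers in inference mode:
# they normalize with their stored (ImageNet) moving mean/variance instead
# of the current batch statistics, and those moving statistics are not updated.
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
)
```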

