I have far more unlabeled data than labeled data, so I would like to train an autoencoder using MobileNetV2 as the encoder, and then use the pre-trained encoder for classifying the labeled data.

I think it is rather difficult to "invert" the MobileNet architecture to create a decoder. Therefore, my question is: can I use a different architecture for the decoder, or will this introduce weird artefacts?

oezguensi

2 Answers


Other replies have commented on the skip connections of a U-Net. I believe you want to exclude those skip connections from your autoencoder. You say you want to use the autoencoder for unsupervised pretraining, which means forcing the data through a bottleneck; skip connections let information bypass that bottleneck, so they would work against you if you later want to reuse the encoder for a classification task.

You ask whether the decoder should 'mirror' the MobileNet encoder. This is actually an interesting one, and I think it could work even if the decoder does not look like the encoder at all. Since you don't need to (and in fact shouldn't) add skip connections, this should be easy to try.

rknoops

can I use a different architecture for the decoder, or will this introduce weird artifacts?

If you are using a U-Net-like architecture with skip connections from encoder to decoder layers, the outputs of the corresponding layers must have the same spatial resolution. There are no other commonly recognized limitations on the decoder architecture for convolutional networks.
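As a toy illustration of that constraint (a PyTorch sketch with made-up shapes, not from the answer): a skip connection typically concatenates an encoder feature map with a decoder feature map, which only works once their spatial sizes agree.

```python
import torch
import torch.nn as nn

# Feature map from some encoder layer: 28x28 spatial resolution.
enc_feat = torch.randn(1, 64, 28, 28)
# Decoder features one stage deeper: 14x14 spatial resolution.
dec_feat = torch.randn(1, 64, 14, 14)

# Upsample the decoder features 14 -> 28 so the resolutions match,
# then concatenate along the channel dimension for the skip connection.
up = nn.Upsample(scale_factor=2)
merged = torch.cat([enc_feat, up(dec_feat)], dim=1)  # (1, 128, 28, 28)
```

Concatenating without the upsampling step would raise a shape error, which is exactly the resolution constraint described above.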

mirror2image