
I'm writing about the different model architectures used in NLP, namely encoder-only, decoder-only, and encoder-decoder models, and have come across what seems to be a naming inconsistency. Decoder-only models seem to be referred to as autoregressive models, since they autoregressively predict the next token, and encoder-decoder models can be referred to as sequence-to-sequence models, since they map a sequence of tokens to a different sequence of tokens.
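To make my understanding concrete, here is a toy sketch of what "autoregressively predicting the next token" means; the `next_token_logits` function is a hypothetical stand-in for a real decoder-only model:

```python
import torch

def next_token_logits(tokens: torch.Tensor) -> torch.Tensor:
    # Stand-in for a decoder-only language model; returns random
    # scores over a toy vocabulary instead of real predictions.
    vocab_size = 100
    return torch.randn(vocab_size)

tokens = torch.tensor([1, 5, 7])  # the prompt so far
for _ in range(10):
    logits = next_token_logits(tokens)
    next_tok = int(torch.argmax(logits))  # greedy pick of the next token
    # The prediction is appended and fed back in on the next step,
    # which is what makes the process autoregressive.
    tokens = torch.cat([tokens, torch.tensor([next_tok])])
print(tokens.tolist())
```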

However, I found various sources that conflict with each other as to what an autoencoder is: many claim that encoder-only models are autoencoders, while others refer to encoder-decoder models as autoencoders.

KurtMica

1 Answer


You will find that the nomenclature in this field is not particularly rigorous. One of the main reasons I visit this site is to keep up with the new names professors (and others) are using for existing things.

That said, an autoencoder comprises an encoder, a latent space (some type of information choke point), and a decoder. Based on this, and given the options you mention, a strong argument can be made that it is an encoder-decoder model; however, it is not a sequence-to-sequence model at all. It simply attempts to reconstruct its input, not predict a future value, token, or anything else.
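For illustration, here is a minimal sketch of that structure in PyTorch, assuming a toy 784-dimensional input (e.g. a flattened 28x28 image); the layer sizes are arbitrary:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: compresses the input into a low-dimensional latent code.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            nn.Linear(128, latent_dim),  # the information choke point
        )
        # Decoder: reconstructs the original input from the latent code.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128),
            nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z)

model = Autoencoder()
x = torch.randn(16, 784)                 # a batch of toy inputs
x_hat = model(x)
loss = nn.functional.mse_loss(x_hat, x)  # the target is the input itself
```

Note that the training target is the input itself, which is exactly what distinguishes this from a sequence-to-sequence model mapping to a different output sequence.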

David Hoelzer