Is there any rule of thumb for choosing the number of hidden units in an LSTM? Is it similar to choosing the number of hidden neurons in a regular feedforward neural network? I'm getting better results with my LSTM when I use a much larger number of hidden units (e.g. 300 hidden units for a problem with 14 inputs and 5 outputs). Is it normal for an LSTM to need many more hidden units than a feedforward ANN, or am I just greatly overfitting my problem?
1 Answer
I'm not sure about what you are referring to when you say "number of hidden units", but I will assume that it's the dimension of the hidden vector $h_t \in \mathbb{R}^N$ in this definition of an LSTM.
In general, the larger your model, in your case the larger $N$, the more capacity it has and, therefore, the more complex a function it can represent.
If by "better result" you mean a smaller loss on the training dataset, then it is very likely that increasing $N$ is simply making you overfit more; compare against a held-out validation loss to check.
However, there are many techniques that increase a model's expressiveness while limiting overfitting, such as dropout.
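One way to see why a large $N$ invites overfitting is to count parameters: an LSTM layer has four gates, each with an input-to-hidden weight matrix, a hidden-to-hidden weight matrix, and a bias, so the count grows quadratically in $N$. A small sketch (the function name is my own, not from any library):

```python
def lstm_param_count(input_dim: int, hidden_dim: int) -> int:
    """Parameter count of a single standard LSTM layer.

    Each of the 4 gates (input, forget, cell, output) has:
      - an input-to-hidden matrix of shape (hidden_dim, input_dim)
      - a hidden-to-hidden matrix of shape (hidden_dim, hidden_dim)
      - a bias vector of length hidden_dim
    """
    return 4 * (hidden_dim * input_dim + hidden_dim * hidden_dim + hidden_dim)

# The asker's setting: 14 inputs, 300 hidden units.
print(lstm_param_count(14, 300))  # 378000
print(lstm_param_count(14, 50))   # 13000
```

With 14 inputs, going from $N = 50$ to $N = 300$ takes the layer from roughly 13k to 378k parameters, so the model's capacity (and its ability to memorize the training set) grows much faster than linearly in $N$.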
Raphael Lopez Kaufman