Is there any rule of thumb for choosing the number of hidden units in an LSTM? Is it similar to choosing the number of hidden neurons in a regular feedforward neural network? I'm getting better results with my LSTM when I use a much larger number of hidden units (e.g. 300 hidden units for a problem with 14 inputs and 5 outputs). Is it normal for an LSTM to need many more hidden units than a feedforward ANN, or am I just greatly overfitting my problem?
1 Answer
I'm not sure about what you are referring to when you say "number of hidden units", but I will assume that it's the dimension of the hidden vector $h_t \in \mathbb{R}^N$ in this definition of an LSTM.
In general, the larger your model, in your case the larger $N$, the more capacity it has and, therefore, the more complex a function it can represent.
If by "better result" you mean a smaller loss on the training dataset, then it is very likely that increasing $N$ is simply making you overfit more; compare against a held-out validation loss to check.
However, there are many techniques that increase a model's expressiveness while limiting overfitting, such as dropout.
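One way to see why a large $N$ invites overfitting is to count parameters: an LSTM layer has four gates, each with an input-to-hidden weight matrix, a hidden-to-hidden weight matrix, and a bias, so the count grows quadratically in $N$. A small sketch (the function name is my own, not from any library):

```python
def lstm_param_count(input_dim: int, hidden_dim: int) -> int:
    """Parameter count of a single standard LSTM layer.

    Each of the 4 gates (input, forget, cell, output) has:
      - an input-to-hidden matrix of shape (hidden_dim, input_dim)
      - a hidden-to-hidden matrix of shape (hidden_dim, hidden_dim)
      - a bias vector of length hidden_dim
    """
    return 4 * (hidden_dim * input_dim + hidden_dim * hidden_dim + hidden_dim)

# The asker's setting: 14 inputs, 300 hidden units.
print(lstm_param_count(14, 300))  # 378000
print(lstm_param_count(14, 50))   # 13000
```

With 14 inputs, going from $N = 50$ to $N = 300$ takes the layer from roughly 13k to 378k parameters, so the model's capacity (and its ability to memorize the training set) grows much faster than linearly in $N$.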
Raphael Lopez Kaufman