
I am training LSTM neural networks with Keras on a small mobile GPU. Training on the GPU is slower than on the CPU. I found some articles claiming that it is hard to train LSTMs (and RNNs in general) on GPUs because the training cannot be parallelized.

Is this true? Is LSTM training on large GPUs, such as a 1080 Ti, faster than on CPUs?

asked by Dieshe

2 Answers


From the Nvidia developer page on LSTMs (https://developer.nvidia.com/discover/lstm):

Accelerating Long Short-Term Memory using GPUs

The parallel processing capabilities of GPUs can accelerate the LSTM training and inference processes. GPUs are the de-facto standard for LSTM usage and deliver a 6x speedup during training and 140x higher throughput during inference when compared to CPU implementations. cuDNN is a GPU-accelerated deep neural network library that supports training of LSTM recurrent neural networks for sequence learning. TensorRT is a deep learning model optimizer and runtime that supports inference of LSTM recurrent neural networks on GPUs. Both cuDNN and TensorRT are part of the NVIDIA Deep Learning SDK.
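For a concrete illustration (this is not part of the quoted Nvidia text): in TensorFlow 2.x, the standard tf.keras.layers.LSTM layer dispatches to the fused cuDNN kernel automatically whenever a GPU is visible and the layer keeps its cuDNN-compatible defaults. A minimal sketch, assuming TensorFlow 2.x and a CUDA-capable GPU, with placeholder dimensions and random data:

    # Minimal sketch, assuming TensorFlow 2.x with a CUDA GPU visible to it.
    # tf.keras.layers.LSTM uses the fused cuDNN kernel automatically when its
    # arguments stay at the cuDNN-compatible defaults (activation='tanh',
    # recurrent_activation='sigmoid', recurrent_dropout=0, unroll=False,
    # use_bias=True).
    import numpy as np
    import tensorflow as tf

    timesteps, features = 50, 16   # placeholder dimensions

    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(128, input_shape=(timesteps, features)),  # cuDNN-eligible
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

    # Random data just to exercise one training epoch.
    x = np.random.rand(256, timesteps, features).astype("float32")
    y = np.random.rand(256, 1).astype("float32")
    model.fit(x, y, batch_size=64, epochs=1)

Changing the activations away from the defaults, or setting recurrent_dropout, forces the slower generic implementation, which is one common reason an LSTM ends up faster on the CPU than on the GPU.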

answered by pasaba por aqui

I found that Keras provides cuDNN-accelerated recurrent layers, for example CuDNNLSTM (https://keras.io/layers/recurrent/#cudnnlstm). They are very fast on the GPU, whereas the standard LSTM layer runs faster on the CPU than on the GPU; see the sketch below.
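A minimal sketch of swapping in the cuDNN-fused layer, assuming standalone Keras 2.x with the TensorFlow backend and a CUDA-capable GPU (CuDNNLSTM only runs on the GPU; the dimensions and data below are placeholders):

    # Minimal sketch, assuming standalone Keras 2.x with the TensorFlow
    # backend and a CUDA-capable GPU; CuDNNLSTM has no CPU fallback.
    import numpy as np
    from keras.models import Sequential
    from keras.layers import CuDNNLSTM, Dense

    timesteps, features = 50, 16   # placeholder dimensions

    model = Sequential([
        CuDNNLSTM(128, input_shape=(timesteps, features)),  # cuDNN-fused LSTM
        Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

    # Random data just to exercise one training epoch.
    x = np.random.rand(256, timesteps, features).astype("float32")
    y = np.random.rand(256, 1).astype("float32")
    model.fit(x, y, batch_size=64, epochs=1)

The trade-off is that CuDNNLSTM does not expose options such as custom activations or recurrent dropout, which is what allows the fused kernel to be so much faster.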

answered by Dieshe