
On page 27 of the DeepMind AlphaGo paper appears the following sentence:

The first hidden layer zero pads the input into a $23 \times 23$ image, then convolves $k$ filters of kernel size $5 \times 5$ with stride $1$ with the input image and applies a rectifier nonlinearity.

What does "convolves $k$ filters" mean here?

Does it mean the following?

The first hidden layer is a convolutional layer with $k$ groups of $(19 \times 19)$ neurons. All the neurons in a group share a single kernel of $(5 \times 5 \times numChannels + 1)$ parameters (input weights plus a bias term), where $numChannels$ is 48 (the number of feature planes in the input image stack).

The outputs of all $(19 \times 19 \times k)$ neurons are available to the second hidden layer (which happens to be another convolutional layer, but could in principle be fully connected).
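To make the arithmetic behind this reading concrete, here is a minimal sketch that computes the sizes it implies. It assumes the interpretation above is correct, that the padding is 2 cells on each side (which is what turns a $19 \times 19$ board into a $23 \times 23$ image), and that each filter has one bias. The function names and the value `k=192` are only illustrative choices, not something stated in the quoted sentence.

```python
def conv_output_size(input_size, pad, kernel, stride=1):
    """Output size of a convolution along one spatial dimension."""
    return (input_size + 2 * pad - kernel) // stride + 1


def conv_layer_counts(board=19, pad=2, kernel=5, stride=1,
                      num_channels=48, k=192):
    """Sizes implied by the interpretation in the question.

    k=192 is an illustrative value (an assumption); the quoted
    sentence leaves k unspecified.
    """
    out = conv_output_size(board, pad, kernel, stride)
    # One shared kernel per filter: weights over all input planes plus one bias.
    params_per_filter = kernel * kernel * num_channels + 1
    return {
        "padded_size": board + 2 * pad,      # size of the zero-padded image
        "output_size": out,                  # spatial size of each output plane
        "neurons": out * out * k,            # k groups of out x out neurons
        "params_per_filter": params_per_filter,
        "total_params": k * params_per_filter,
    }


counts = conv_layer_counts()
print(counts)
```

Under these assumptions each of the $k$ filters produces one $19 \times 19$ output plane, so the parameter count depends on $k$ and the kernel size but not on the board size.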

nbro
