
I'm implementing a neural network framework from scratch in C++ as a learning exercise. There is one concept I don't see explained anywhere clearly:

How do you go from your last convolutional or pooling layer, which is 3 dimensional, to your first fully connected layer in the network?

Many sources say that you should flatten the data. Does this mean that you should simply create a $1D$ vector of size $N \cdot M \cdot D$ (where $N \times M$ is the size of the last convolutional layer and $D$ is the number of activation maps in that layer) and put the numbers into it one by one in some arbitrary order?

If this is the case, I understand how to propagate further down the line, but how does backpropagation work here? Do you just put the gradient values back into the activation maps in the reverse order?

I also read that you can do this "flattening" as a tensor contraction. How does that work exactly?

1 Answer


Yes, you are correct (I think it is quite easy to implement in C++ with pointers). The arbitrary order has to be maintained, though, since fully connected networks are not translation invariant: you have to make sure that if pixel $(1,5,6)$ is supplied to node $38$, i.e. indexed as $37$ in the flattened input to the fully connected layer, then it stays that way from then on (you cannot later put, say, pixel $(1,6,5)$ into node $38$).
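
To make the fixed mapping concrete, here is a minimal C++ sketch (the function name, the `double` element type, and the nested-`std::vector` storage are illustrative assumptions, not something from the question): it flattens a $D \times N \times M$ volume with a row-major index, so each activation always lands in the same flat position on every forward pass.

```cpp
#include <cstddef>
#include <vector>

// Flatten a D x N x M activation volume, stored as volume[d][i][j],
// into a 1D vector using a fixed row-major mapping:
//   flat_index = d * (N * M) + i * M + j
// The particular mapping is arbitrary, but it must stay the same on
// every forward pass so that each flat position always feeds the same
// fully connected weight.
std::vector<double> flatten(
    const std::vector<std::vector<std::vector<double>>>& volume)
{
    const std::size_t D = volume.size();
    const std::size_t N = volume[0].size();
    const std::size_t M = volume[0][0].size();

    std::vector<double> flat(D * N * M);
    for (std::size_t d = 0; d < D; ++d)
        for (std::size_t i = 0; i < N; ++i)
            for (std::size_t j = 0; j < M; ++j)
                flat[d * N * M + i * M + j] = volume[d][i][j];
    return flat;
}
```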

Backpropagation works the same way it always does. It is hard to give a purely verbal explanation, so I will give you this picture:

[image: backpropagation through the flattening step]

So, basically if you visualise like this you understand you have to differentiation will propagate, the "flattening" is only reshaping the value lookup table it is not changing the way the values affect final loss, so if you take gradient w.r.t each values and then convert it back to a $3D$ map same way as before and then propagate the gradients as you were doing in previous layers.