I'm trying to implement an example from the classic AI paper "Learning representations by back-propagating errors" by Rumelhart, Hinton, and Williams.
The example trains a network to predict the third term in triples of (person_0, relationship, person_1) on a small artificial dataset. The network consists of 5 fully-connected layers with sigmoid activations. There are 24 input units encoding person_0 and 12 input units encoding the relationship. Each of the 24 output units corresponds to a particular person (person_1) across two family trees. Details are in the paper: http://www.cs.utoronto.ca/~hinton/absps/naturebp.pdf
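For reference, here is a minimal sketch of the forward pass I have in mind. The hidden-layer sizes are placeholders, and the paper's separate encoding layers for person and relationship are collapsed into one plain stack here, so this is an approximation rather than my exact code:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# 36 inputs (24 person_0 + 12 relationship) -> three hidden layers -> 24
# outputs (person_1). Hidden sizes are illustrative; the paper uses small
# bottleneck layers per input group rather than a single stack.
sizes = [36, 12, 12, 12, 24]
weights = [rng.uniform(-0.3, 0.3, (m, n)) for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(a @ W + b)
    return a

x = np.zeros(36)
x[3] = 1.0       # one-hot person_0
x[24 + 5] = 1.0  # one-hot relationship
print(forward(x))
```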
What happens is that instead of learning to predict the correct third terms (producing values of ~0.8 in the right output units and ~0.2 in the wrong ones), the network becomes indifferent to its input and always returns the same output, which apparently reflects the overall distribution of third terms across the training set.
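This is how I check for the collapse: if the output barely varies across inputs and its mean tracks the empirical frequency of each person_1 in the targets, the network is just predicting base rates. The names below (`net`, `X`, `Y`) are placeholders for my own data and forward function:

```python
import numpy as np

def check_collapse(net, X, Y):
    """X: (N, 36) one-hot inputs, Y: (N, 24) one-hot targets, net(x) -> (24,)."""
    outs = np.stack([net(x) for x in X])
    base_rate = Y.mean(axis=0)         # empirical frequency of each person_1
    spread = outs.std(axis=0).mean()   # ~0 means the output ignores the input
    corr = np.corrcoef(outs.mean(axis=0), base_rate)[0, 1]
    print(f"mean output std across inputs: {spread:.4f}")
    print(f"correlation with target base rates: {corr:.3f}")
```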
I'm not sure why I can't reproduce the paper's findings. Another example from the paper works fine, and the fact that some learning does occur here (loss drops from ~350 to ~50) makes me think the backpropagation code is bug-free.
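To rule out a backprop bug more rigorously, a finite-difference gradient check along these lines could compare my analytic gradients against a numerical estimate (a generic sketch, not my exact code):

```python
import numpy as np

def grad_check(loss_fn, params, analytic_grads, eps=1e-5):
    """Compare backprop gradients to central finite differences.

    loss_fn: () -> float, reads `params` (a list of np arrays) in place.
    analytic_grads: gradients from backprop, same shapes as params.
    """
    for W, G in zip(params, analytic_grads):
        idx = tuple(np.random.randint(s) for s in W.shape)  # random entry
        old = W[idx]
        W[idx] = old + eps; plus = loss_fn()
        W[idx] = old - eps; minus = loss_fn()
        W[idx] = old  # restore the original weight
        numeric = (plus - minus) / (2 * eps)
        rel = abs(numeric - G[idx]) / max(1e-12, abs(numeric) + abs(G[idx]))
        print(f"numeric={numeric:+.6e} analytic={G[idx]:+.6e} rel_err={rel:.2e}")
```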
Any help is highly appreciated.
The full code I'm using, plus a more detailed description of the issue, is available here: https://github.com/jan-grzybek/papers/tree/main/3236088
Thanks!