
I'm attempting to program my own system to run a neural network. To reduce the number of nodes needed, it was suggested that I make it treat rotations of the input identically.

My network aims to learn and predict Conway's Game of Life by looking at every square and its surrounding squares in a grid, and giving the output for that square. Its input is a string of 9 bits:

[Image: Glider]

The above is represented as 010 001 111.

There are, however, three other rotations of this shape, and all of them produce the same output:

[Image: Glider rotations]

My network topology is 9 input nodes and 1 output node for the next state of the centre square in the input. How can I construct the hidden layer(s) so that they treat each of these rotations as the same input, cutting the number of possible inputs down to a quarter of the original?

Edit:

There is also a flip of each rotation which produces an identical result. Incorporating these cuts the number of distinct inputs to an eighth of the original. As with the glider, my aim is for all of these inputs to be treated exactly the same. Will this have to be done with pre-processing, or can I incorporate it into the network?
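As a concrete illustration, the eight symmetries being described (four rotations, each optionally flipped) can be enumerated with a short sketch like this; the `symmetries` helper name and the NumPy encoding are my own choices, assuming the row-major 9-bit string from the question:

```python
import numpy as np

def symmetries(bits):
    # `bits` is a row-major 3x3 neighbourhood, e.g. "010001111".
    grid = np.array(list(bits), dtype=int).reshape(3, 3)
    # Four rotations of the grid itself, then four of its mirror image.
    for g in (grid, np.fliplr(grid)):
        for k in range(4):
            yield "".join(map(str, np.rot90(g, k).flatten()))

# The glider neighbourhood has no symmetry of its own,
# so all eight variants are distinct strings.
print(sorted(set(symmetries("010001111"))))
```

All eight strings would need to map to the same output, whether they are normalized away in pre-processing or all covered in the training data.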

Aric

3 Answers


If I understand correctly, your single output node will be the next state of the square in the middle. You don't need to worry about the number of nodes in the hidden layers as long as you have sufficient resources to train the model. This problem is very easy for a neural network to learn, so there are no size concerns.

You need to do supervised training, which means feeding in input data together with the matching expected output. Make sure that in your training data all 4 rotations are assigned the same output; that way your network should learn to treat all of them the same way.
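Since there are only 2^9 = 512 possible neighbourhoods, one way to guarantee this is to label the complete input space directly from Conway's rules. A sketch (the answer's own code below loads its data from a CSV instead; `life_next` is a name of my own):

```python
def life_next(bits):
    # Next state of the centre cell (index 4 in the row-major
    # 9-character string) under Conway's Game of Life rules.
    centre = int(bits[4])
    live_neighbours = sum(int(b) for i, b in enumerate(bits) if i != 4)
    if centre == 1:
        return 1 if live_neighbours in (2, 3) else 0  # survival
    return 1 if live_neighbours == 3 else 0           # birth

# Complete training set: every possible neighbourhood with its true label.
data = [(f"{n:09b}", life_next(f"{n:09b}")) for n in range(512)]
```

Note that the rule depends only on the count of live neighbours, so every rotation and flip of an input necessarily receives the same label.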

You made me curious, so I tried it myself. My model learned the rule with 100% accuracy in about 20 epochs, running within a few seconds on my old laptop. I only slightly changed the output to be categorical, either [0,1] or [1,0], but that gives the same result you are looking for. Just for reference, here is the code, written in Python:

from keras.models import Sequential
from keras.layers import Dense
from keras import optimizers
from keras.utils.np_utils import to_categorical
import helper  # my own CSV-loading helper

# Load the 9-bit neighbourhoods and their expected outputs.
x_, y_ = helper.fnn_csv_toXY("conway.csv", "output", False)
y_binary = to_categorical(y_)

# Simple feed-forward network: 9 inputs -> 100 -> 20 -> 2 softmax outputs.
model = Sequential()
model.add(Dense(100, activation='relu', kernel_initializer='glorot_uniform', input_shape=(9,)))
model.add(Dense(20, activation='relu', kernel_initializer='glorot_uniform'))
model.add(Dense(2, activation='softmax'))
adam = optimizers.Adam()
model.compile(optimizer=adam,
              loss='categorical_crossentropy',
              metrics=['acc'])
model.fit(x_, y_binary, epochs=100)
Manngo

You have identified an optimization in your problem space and want to bake it into your neural net. I suggest preprocessing instead: compose your optimization with a neural net that solves a subset of the problem.

In other words, normalize your input by hand-coding a rotation algorithm that maps every input to a canonical orientation, capturing the equivalence highlighted in your post. Then feed the output of this transformation to your neural net, both for training and for all other uses. This means you are training the neural net to tackle only the sub-problem you identified, since the rotations are redundant.

Test your normalizer by generating random input, rotating it into all four orientations, running the normalizer on each one, and checking that the results are all identical.
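A minimal normalizer along these lines, assuming the row-major 9-bit encoding from the question, could pick the lexicographically smallest rotation as the canonical form (extend the candidate set with `np.fliplr` if you also want flip-invariance; `normalize` is a name of my own):

```python
import numpy as np

def normalize(bits):
    # Canonical representative of a 9-bit neighbourhood:
    # the lexicographically smallest of its four rotations.
    grid = np.array(list(bits), dtype=int).reshape(3, 3)
    return min("".join(map(str, np.rot90(grid, k).flatten()))
               for k in range(4))
```

All four rotations of the glider then map to the same string, which is exactly the equivalence-class check described above.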

Harrichael

To be purist about it, begin by considering the input differently: as a circular array of size four, each item containing a pair of bits, plus a separate centre bit:

... 01, 01, 11, 10 ...

centre: 0

Throughout the design of the network, continue this circular structure and center point paradigm.
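For the glider from the question ("010001111"), this decomposition can be sketched as follows. The pair ordering and the `to_circular` name are my own choices; the useful property is that a 90-degree rotation of the grid becomes a one-step rotation of the ring:

```python
def to_circular(bits):
    # Row-major indices: 0 1 2 / 3 4 5 / 6 7 8.
    # Read the perimeter clockwise from the top-left corner,
    # two cells per ring item; the centre is index 4.
    perimeter = [bits[i] for i in (0, 1, 2, 5, 8, 7, 6, 3)]
    ring = [perimeter[i] + perimeter[i + 1] for i in range(0, 8, 2)]
    return bits[4], ring

print(to_circular("010001111"))  # ('0', ['01', '01', '11', '10'])
```

Rotating the grid then only shifts the ring, so a network (or preprocessor) built around this structure can treat all four rotations as the same configuration.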

Douglas Daseeco