
I'm attempting to train a convolutional neural net to perform binary classification of volumes of shape $(H, W, C)$ (i.e., height, width, channels). For the sake of this example, let's say that the volumes represent RGB images and that $C = 3$.

My data are formatted such that pixel values are mutually exclusive across channels. In other words, if the value at position $(H_0, W_0)$ in channel $C_i$ is greater than 0, then the values at $(H_0, W_0)$ in every other channel $C_{j \neq i}$ are 0. Said another way, for a given pixel location $(H_0, W_0)$, if the green channel value is greater than 0, the red and blue channel values are always 0.
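To make the constraint concrete, here is a minimal NumPy sketch that builds a toy volume with this property and checks it. The helper name `is_mutually_exclusive` and the toy dimensions are illustrative assumptions, not part of my actual pipeline.

```python
import numpy as np


def is_mutually_exclusive(volume):
    """Return True if every pixel has at most one non-zero channel.

    `volume` is assumed to have shape (H, W, C).
    """
    nonzero_per_pixel = np.count_nonzero(volume, axis=-1)
    return bool(np.all(nonzero_per_pixel <= 1))


# Toy data (hypothetical sizes): pick exactly one active channel per pixel.
rng = np.random.default_rng(0)
H, W, C = 4, 5, 3
active = rng.integers(0, C, size=(H, W))      # index of the non-zero channel
values = rng.uniform(0.1, 1.0, size=(H, W))   # strictly positive pixel values

volume = np.zeros((H, W, C))
rows = np.arange(H)[:, None]
cols = np.arange(W)[None, :]
volume[rows, cols, active] = values           # one non-zero entry per pixel

print(is_mutually_exclusive(volume))
```

A check like this could be run over the training set once to confirm the assumption holds before relying on it.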

The model itself consists of multiple 2D convolutional layers (each followed by max pooling operations), followed by two fully-connected layers.
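For a rough sense of what the fully-connected layers receive, the sketch below traces the spatial size through a stack of CONV-POOL blocks. The kernel size (3), stride (1), lack of padding, pool size (2), number of blocks, and the 64 x 64 input are all assumptions chosen for illustration; the real model may differ.

```python
def conv_pool_shapes(h, w, n_blocks, kernel=3, pool=2):
    """Trace (height, width) after each CONV-POOL block.

    Assumes an unpadded, stride-1 convolution followed by
    non-overlapping max pooling (hypothetical settings).
    """
    shapes = []
    for _ in range(n_blocks):
        h, w = h - kernel + 1, w - kernel + 1  # unpadded convolution
        h, w = h // pool, w // pool            # max pooling
        shapes.append((h, w))
    return shapes


# Hypothetical 64 x 64 input passed through three blocks.
shapes = conv_pool_shapes(64, 64, n_blocks=3)
print(shapes)  # [(31, 31), (14, 14), (6, 6)]
```

The number of input channels does not affect these spatial sizes; it only changes the depth of the first layer's filters.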

Of course, I could always just collapse the channels before training by taking the sum of pixel values across channels (creating an image of shape $(H, W, 1)$). However, I would like to keep the channels separate as they represent very different characteristics of the image. Basically, I'd like the network to "learn" the arrangement of non-zero values across the channels. I've also considered doing a $1 \times 1$ convolution across the multi-channel input as the first step in the model, followed by my various CONV-POOL operations.
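The difference between the two options can be sketched in NumPy. Summing across channels discards which channel a value came from, whereas a $1 \times 1$ convolution is a per-pixel weighted sum across channels, so with mutually exclusive channels it reduces to rescaling each pixel by the weight of its active channel. The function names and the weights below are made up for illustration; in a real model the weights would be learned.

```python
import numpy as np


def collapse_by_sum(volume):
    """Sum over channels: (H, W, C) -> (H, W, 1)."""
    return volume.sum(axis=-1, keepdims=True)


def conv1x1(volume, weights, bias):
    """1x1 convolution as a per-pixel linear map over channels.

    volume: (H, W, C), weights: (C, K), bias: (K,) -> output (H, W, K)
    """
    return np.einsum("hwc,ck->hwk", volume, weights) + bias


# Two pixels with the same magnitude in different channels.
volume = np.zeros((1, 2, 3))
volume[0, 0, 0] = 0.5  # "red" pixel
volume[0, 1, 1] = 0.5  # "green" pixel

summed = collapse_by_sum(volume)

# Hypothetical weights with a single output channel (K = 1).
weights = np.array([[1.0], [2.0], [3.0]])
bias = np.zeros(1)
mixed = conv1x1(volume, weights, bias)

print(summed[0, :, 0])  # both pixels look identical after summing
print(mixed[0, :, 0])   # the pixels are scaled by different channel weights
```

Note that with a single output channel the $1 \times 1$ convolution can still conflate channels (for example, 1.0 in the first channel and 0.5 in the second map to the same output under these weights), so using more than one output channel may be the safer choice.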

As far as I can tell, this is an atypical approach, so I'm wondering whether there are obvious issues with training a CNN on multi-channel images whose channels are mutually exclusive. Does it even make sense to do this? I'd also love to hear about any reading material or resources on training CNNs on this kind of input. Thanks!

tomsasani
