
I am trying to take a high-level semantic segmentation model (something like DeepLabv3) that takes 2D RGB images as input, and fine-tune it for my problem. However, I am working with brain MRI scans, which are grayscale 3D images.

The obvious solution is to split each 3D image into 2D slices along one axis, run inference on the slices individually, and then combine the results back into a 3D representation. This also has the added benefit of multiplying the amount of training data I have. However, I don't like this approach for a few reasons. First, it inherently loses important information about the brain's anatomy. In my case, where I am looking for white matter lesions (WML), locality across slices can be very helpful in identifying them. For example:

[Image: WML in different layers.]

Splitting the volume into slices would inevitably make these lesions harder to learn, especially since some lesion voxels are usually only identifiable because of more apparent lesions right next to them in neighbouring slices. The second issue is that it would create a severe training imbalance, since most of the slices would contain nothing to learn from.
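For concreteness, this is roughly the slice-wise pipeline I am describing (just a sketch, assuming a torchvision DeepLabv3 and a volume stored as a NumPy array; normalisation and other preprocessing are omitted, and the function name is my own placeholder):

```python
import numpy as np
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

model = deeplabv3_resnet50(num_classes=2)  # e.g. background vs. lesion
model.eval()

def segment_volume_slicewise(volume: np.ndarray) -> np.ndarray:
    """Run a 2D model on each axial slice and stack the predictions back into 3D."""
    predictions = []
    with torch.no_grad():
        for slice_2d in volume:  # iterate over the depth axis of a (D, H, W) volume
            # Grayscale slice -> fake RGB by repeating the channel 3 times,
            # since the pretrained backbone expects 3-channel input.
            x = torch.from_numpy(slice_2d).float()
            x = x.unsqueeze(0).repeat(3, 1, 1).unsqueeze(0)   # (1, 3, H, W)
            logits = model(x)["out"]                          # (1, num_classes, H, W)
            predictions.append(logits.argmax(dim=1).squeeze(0).numpy())
    return np.stack(predictions, axis=0)  # back to (D, H, W)
```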

I think being able to exploit this locality of lesions would be really beneficial, and I still want to make use of these existing 2D models. I am fairly new to machine learning, so I am not sure whether there is an obvious solution to what I am looking for, or whether it is inherently impossible. What would be the best approach for migrating a 2D model to 3D images for fine-tuning?
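To illustrate the kind of middle ground I have in mind (just a sketch, not necessarily the right approach): each slice could be stacked with its two neighbouring slices as the three input channels a pretrained RGB model expects, which I believe is sometimes called a 2.5D setup. The function name below is my own placeholder:

```python
import numpy as np
import torch

def make_25d_batch(volume: np.ndarray) -> torch.Tensor:
    """Turn a (D, H, W) volume into a (D, 3, H, W) batch of slice triplets."""
    depth = volume.shape[0]
    triplets = []
    for i in range(depth):
        below = volume[max(i - 1, 0)]          # clamp indices at the volume boundaries
        above = volume[min(i + 1, depth - 1)]
        triplets.append(np.stack([below, volume[i], above], axis=0))  # (3, H, W)
    return torch.from_numpy(np.stack(triplets, axis=0)).float()       # (D, 3, H, W)
```

This would at least give each prediction some through-plane context, though it obviously still falls short of a truly 3D model.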

Later on, I also want to make use of the different MRI modalities I have available. This adds yet another dimension to the problem, and I think a solution to the 3D issue would let me move on to this 4D model afterwards.
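If I do get there, I imagine the model's input layer would need one channel per modality rather than three RGB channels. A rough sketch of what I mean, assuming a torchvision DeepLabv3 with a ResNet backbone (the layer name and the weight-averaging initialisation are my assumptions, not an established recipe):

```python
import torch
import torch.nn as nn
from torchvision.models.segmentation import deeplabv3_resnet50

def widen_input_conv(model: nn.Module, num_modalities: int) -> nn.Module:
    """Replace the first conv so the model accepts one channel per MRI modality."""
    old_conv = model.backbone.conv1                      # original 3-channel conv
    new_conv = nn.Conv2d(num_modalities, old_conv.out_channels,
                         kernel_size=old_conv.kernel_size,
                         stride=old_conv.stride,
                         padding=old_conv.padding,
                         bias=old_conv.bias is not None)
    with torch.no_grad():
        # Initialise each new input channel with the mean of the existing RGB
        # filters so any pretrained features are not thrown away entirely.
        mean_weight = old_conv.weight.mean(dim=1, keepdim=True)        # (out, 1, k, k)
        new_conv.weight.copy_(mean_weight.repeat(1, num_modalities, 1, 1))
    model.backbone.conv1 = new_conv
    return model

model = widen_input_conv(deeplabv3_resnet50(num_classes=2), num_modalities=4)
```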
