Why do instance segmentation architectures use mask reconstruction rather than regression?

Question

I'm wondering why many segmentation CNN architectures predict a binary mask instead of regressing the coordinates of the mask polygon. Many object detectors already use regression to find the coordinates of bounding boxes.

score 1 · Accepted Answer · answered Jul 14 '22 at 14:28

The reason is probably that segmentation polygons vary in shape and complexity. You don't know in advance how many points each polygon needs, so defining a fixed-size network output that specifies the polygon is not straightforward. In contrast, a bounding box is always defined by 4 numbers (the x and y coordinates of two opposite corners, e.g. lower left and upper right, are enough). A binary mask avoids the problem because its size is set by the output resolution, not by how complex the shape is.
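To make the size argument concrete, here is a minimal sketch in plain Python. The two shapes, the 8x8 grid, and the helper names (`bounding_box`, `point_in_polygon`, `rasterize`) are made up for illustration and don't come from any particular architecture. It only shows that the box and mask targets keep the same size for both shapes while the polygon target does not.

```python
def bounding_box(polygon):
    """Return (x_min, y_min, x_max, y_max): always 4 numbers."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def point_in_polygon(px, py, polygon):
    """Even-odd ray casting test for a single point."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through py?
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def rasterize(polygon, height, width):
    """Binary mask on a fixed grid, sampled at pixel centers."""
    return [
        [1 if point_in_polygon(c + 0.5, r + 0.5, polygon) else 0
         for c in range(width)]
        for r in range(height)
    ]


# Two hypothetical instances with different numbers of vertices.
triangle = [(1, 1), (7, 1), (4, 6)]
l_shape = [(1, 1), (7, 1), (7, 3), (3, 3), (3, 7), (1, 7)]

for name, poly in [("triangle", triangle), ("l_shape", l_shape)]:
    box = bounding_box(poly)
    mask = rasterize(poly, height=8, width=8)
    print(name)
    print("  polygon target size:", 2 * len(poly))           # varies per shape
    print("  box target size:    ", len(box))                # always 4
    print("  mask target size:   ", len(mask) * len(mask[0]))  # always 64
```

A network head with a fixed number of outputs can be trained against the box or the mask directly. For the polygon you would first have to decide on a vertex count and ordering, which is the part that is not straightforward.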
