I'm wondering why many model architectures for segmentation CNNs reconstruct a binary mask rather than regress the coordinates of the mask polygon. Many object detectors use regression to find the coordinates of bounding boxes.
1 Answer
The reason is probably that segmentation polygons vary widely in shape and complexity. You don't know in advance how many points each polygon needs, so defining a fixed-size network output that specifies the polygon is not straightforward. In contrast, an axis-aligned bounding box is always described by the same small number of values: two opposite corners (e.g. lower left and upper right), i.e. four numbers, are enough.
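To make the difference concrete, here is a minimal sketch (plain NumPy, not taken from any particular architecture). The helper names `polygon_to_box` and `polygon_to_mask` and the 8x8 grid are assumptions made up for illustration. It shows that a box target always has 4 values and a mask target always has the shape of the image, while the number of polygon vertices changes from object to object.

```python
import numpy as np

def polygon_to_box(polygon):
    """Hypothetical helper: axis-aligned box (x_min, y_min, x_max, y_max) of a polygon."""
    pts = np.asarray(polygon, dtype=float)
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    return (x_min, y_min, x_max, y_max)

def polygon_to_mask(polygon, height, width):
    """Hypothetical helper: rasterize a polygon into a binary mask (even-odd rule).

    A pixel is set if its center lies inside the polygon.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    px = xs + 0.5  # x coordinate of each pixel center
    py = ys + 0.5  # y coordinate of each pixel center
    inside = np.zeros((height, width), dtype=bool)
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        if y0 == y1:
            continue  # horizontal edges never cross a horizontal ray
        # Does the edge cross the horizontal line through the pixel center?
        crosses = (y0 > py) != (y1 > py)
        # x position where the edge meets that horizontal line
        x_at = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
        # Toggle "inside" for every crossing to the right of the pixel center
        inside ^= crosses & (px < x_at)
    return inside.astype(np.uint8)

# Two objects with a different number of vertices (assumed example data)
triangle = [(1, 1), (7, 1), (4, 6)]
square = [(2, 2), (6, 2), (6, 6), (2, 6)]

box_triangle = polygon_to_box(triangle)
box_square = polygon_to_box(square)
mask_triangle = polygon_to_mask(triangle, 8, 8)
mask_square = polygon_to_mask(square, 8, 8)

print(len(triangle), len(square))            # polygon length varies per object
print(len(box_triangle), len(box_square))    # box target is always 4 values
print(mask_triangle.shape, mask_square.shape)  # mask target is always (8, 8)
```

A network with a fixed number of output units can be trained against the box or the mask directly, whereas regressing the polygon would need some extra convention (a fixed vertex budget, padding, or a sequential decoder) to cope with the varying length.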
Chillston