In computer vision it is very common to use supervised tasks, where datasets have to be manually annotated by humans. Some examples are object classification (class labels), detection (bounding boxes) and segmentation (pixel-level masks). These datasets are essentially input-output pairs, which are used to train Convolutional Neural Networks to learn the mapping from inputs to outputs via gradient descent optimization. But animals don't need anybody to show them bounding boxes or masks on top of things in order to learn to detect objects and make sense of the visual world around them. This leads me to think that brains must be performing some sort of self-supervision to train themselves to see.
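To make concrete what I mean by self-supervision, here is a minimal sketch of one pretext task from the computer vision literature, rotation prediction, where the training targets are derived from the unlabeled images themselves. It is only an illustration of the idea, not a claim about what brains do; the function name `make_rotation_batch` and the random stand-in images are my own assumptions.

```python
import numpy as np

def make_rotation_batch(images, rng):
    """Build (input, target) pairs from unlabeled square images.

    Each image is rotated by k * 90 degrees and k itself becomes the
    target, so no human annotation is involved.
    """
    ks = rng.integers(0, 4, size=len(images))  # rotation index per image
    rotated = np.stack([np.rot90(img, int(k)) for img, k in zip(images, ks)])
    return rotated, ks

rng = np.random.default_rng(0)
unlabeled = rng.random((8, 32, 32))  # stand-in for 8 unlabeled grayscale images
inputs, targets = make_rotation_batch(unlabeled, rng)
```

A network trained to predict `targets` from `inputs` with an ordinary classification loss never sees a human-provided label, which is the kind of "free" ground truth I am asking about.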

What does current research say about the learning paradigm used by brains to achieve such an outstanding level of visual competence? Which tasks do brains use to train themselves to be so good at processing visual information and making sense of the visual world around them? Or, in other words: how does the brain manage to train its neural networks without having access to manually annotated datasets like ImageNet, COCO, etc.? That is, what does the brain use as ground truth, and what loss function is it optimizing? Finally, can we apply these insights in computer vision?


Update: I posted a related question on Psychology & Neuroscience StackExchange, which I think complements the question I posted here.

Pablo Messina

1 Answer

I think you are slightly conflating two problems: one is the classification of meta visual elements, and the other is the visual system itself.

Our visual system, when it comes to processing information, has had billions of years of iteration (training), so that at birth (and before) we are already tuned for processing visual stimuli and already have the mechanisms to decipher objects in our spatial field of view.

These two papers (L1, L2) contain a great deal of information about the evolution of our visual system and its processing. The second speculates on the connection between that evolution and the construction of "seeing systems", which is very interesting.

For further reading on this in particular, check out David Marr. He was probably the most influential early mind in computer vision, and he is still cited in many top-down AGI and computer vision research projects to this day.

hisairnessag3