
In a video lecture on the development of neural networks and the history of deep learning (you can start from minute 13), the lecturer (Yann LeCun) said that the development of neural networks stalled until the 80s because people were using the wrong neurons: binary, and therefore discontinuous, ones. He attributed this to the slowness of multiplying floating-point numbers at the time, which made the use of backpropagation really difficult.

He said, I quote, "If you have continuous neurons, you need to multiply the activation of a neuron by a weight to get a contribution to the weighted sum."

But that statement stays true even with binary neurons (or any discontinuous activation function). Am I wrong? At least, as long as the neuron is in a hidden layer, its output will still be multiplied by a weight, I guess. The same professor said that the perceptron and ADALINE relied on weighted sums, so they were computing multiplications anyway.
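Here is how I picture the weighted sum in both cases, as a minimal sketch with toy numbers of my own (not from the lecture):

```python
# Toy weighted sum for a continuous vs. a binary neuron (made-up numbers).

weights = [0.4, -1.2, 0.7]

# Continuous neuron: every term needs a genuine floating-point multiplication.
continuous_activations = [0.83, 0.05, 0.61]
z_continuous = sum(w * a for w, a in zip(weights, continuous_activations))

# Binary (0/1) neuron: each product a * w is either w or 0, so the "multiplication"
# degenerates to including the weight in the sum or skipping it.
binary_activations = [1, 0, 1]
z_binary = sum(w for w, a in zip(weights, binary_activations) if a == 1)

print(z_continuous, z_binary)
```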

I don't know what I'm missing here, and I hope someone will enlighten me.

Daviiid

1 Answer


I will first address your main question: "Why did the development of neural networks stop between the 50s and the 80s?" In the 40s and 50s there was a lot of progress (McCulloch and Pitts), and the perceptron was invented (Rosenblatt). That gave rise to AI hype with many promises (exactly like today)!

However, Minsky and Papert proved in 1969 that a single-layer architecture is not enough to build a universal approximating machine (see e.g. Minsky, M. & Papert, S. Perceptrons: An Introduction to Computational Geometry, 1969). That led to the first disappointment in "AI", which lasted until several major breakthroughs in the 1980s: the proof of the universal approximation capabilities of the multi-layer perceptron (Cybenko), the popularisation of the backpropagation algorithm (Rumelhart, Hinton, and colleagues), etc.
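As a concrete illustration of that limitation (my own sketch, not from the book), the classic counterexample is XOR: no single-layer perceptron computes it, while hand-chosen two-layer weights do.

```python
# Brute-force a grid of single-layer perceptron parameters on XOR (illustration
# only, not a proof), then show a hand-wired two-layer network that solves it.
import itertools

xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def perceptron(x, w1, w2, b):
    return 1 if w1 * x[0] + w2 * x[1] + b > 0 else 0

grid = [i / 4 for i in range(-8, 9)]  # weights and bias from -2 to 2 in steps of 0.25
solvable = any(
    all(perceptron(x, w1, w2, b) == y for x, y in xor_data)
    for w1, w2, b in itertools.product(grid, repeat=3)
)
print("single-layer perceptron solves XOR:", solvable)  # False

def mlp(x):
    # Two-layer solution: hidden units compute OR and NAND, the output unit
    # ANDs them together, which is exactly XOR.
    h1 = 1 if x[0] + x[1] - 0.5 > 0 else 0    # OR
    h2 = 1 if -x[0] - x[1] + 1.5 > 0 else 0   # NAND
    return 1 if h1 + h2 - 1.5 > 0 else 0      # AND

print("two-layer network solves XOR:", all(mlp(x) == y for x, y in xor_data))  # True
```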

I agree with LeCun that using continuous activation functions is what enabled the backpropagation algorithm at the time. It is only recently that we have learned to backpropagate through networks with binary activation functions (2016!).
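For illustration, here is a minimal sketch (assuming PyTorch; my own toy example, not the exact scheme of the 2016 paper) of the straight-through estimator commonly used to backpropagate through a binary activation: the forward pass uses a hard sign, while the backward pass pretends the activation was (clipped) identity, so gradients can still flow.

```python
import torch

class BinaryActivation(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)  # binary output; its true gradient is zero almost everywhere

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Straight-through estimator: pass the gradient where |x| <= 1, block it elsewhere.
        return grad_output * (x.abs() <= 1).float()

x = torch.randn(4, requires_grad=True)
loss = BinaryActivation.apply(x).sum()
loss.backward()
print(x.grad)  # nonzero gradients despite the discontinuous forward pass
```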

penkovsky