Questions tagged [backpropagation]

For questions about the back-propagation (also "backprop", often abbreviated "BP") algorithm, which computes the gradient of the objective function (e.g. the mean squared error) with respect to the parameters (or weights) of a neural network trained with gradient descent.

272 questions
45 votes · 4 answers

What is the time complexity for training a neural network using back-propagation?

Suppose that an NN contains $n$ hidden layers, $m$ training examples, $x$ features, and $n_i$ nodes in each layer. What is the time complexity to train this NN using back-propagation? I have a basic idea about how they find the time complexity of…
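A rough sketch of the usual accounting, under the assumption of fully connected layers (an illustrative aside, not part of the question): write $n_0 = x$ for the input width and treat the output layer as layer $n+1$. A forward plus a backward pass on one example is dominated by the weight-matrix products, so it costs on the order of
$$ O\!\left(\sum_{i=1}^{n+1} n_{i-1}\, n_i\right) $$
operations; one epoch over the $m$ training examples therefore costs roughly $O\!\left(m \sum_{i=1}^{n+1} n_{i-1}\, n_i\right)$, and the whole training run multiplies this by the number of epochs.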
19 votes · 1 answer

Are these two versions of back-propagation equivalent?

Just for fun, I am trying to develop a neural network. For backpropagation, I have seen two techniques. The first one is used here and in many other places too. What it does is: it computes the error for each output neuron. It backpropagates it into…
14 votes · 2 answers

Is the mean-squared error always convex in the context of neural networks?

Multiple resources I referred to mention that MSE is great because it's convex. But I don't get how, especially in the context of neural networks. Let's say we have the following: $X$: training dataset $Y$: targets $\Theta$: the set of parameters…
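A note on how this is usually resolved (an illustrative aside, using the question's notation): the squared error is convex as a function of the network's predictions $\hat{Y}$,
$$ J(\hat{Y}) = \lVert \hat{Y} - Y \rVert^2, $$
but the quantity actually minimised during training is $J(\Theta) = \lVert f_\Theta(X) - Y \rVert^2$, and because the network output $f_\Theta(X)$ is a nonlinear function of $\Theta$, this composition is in general not convex in $\Theta$: a convex function composed with a nonlinear map need not be convex.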
13 votes · 1 answer

Can a non-differentiable layer be used in a neural network if it's not learned?

For example, AFAIK, the pooling layer in a CNN is not differentiable, but it can be used because it has no parameters to learn. Is that always true?
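For context, a minimal NumPy sketch (function names are mine) of how the backward pass through a 2×2 max-pooling layer is typically handled: the layer has no parameters to learn, and the upstream gradient is simply routed to whichever element was the maximum in the forward pass, so the operation is differentiable almost everywhere.

    import numpy as np

    def maxpool_forward(x):
        """2x2 max pooling on an (H, W) array with even H, W; also returns the argmax mask."""
        H, W = x.shape
        patches = x.reshape(H // 2, 2, W // 2, 2).transpose(0, 2, 1, 3).reshape(H // 2, W // 2, 4)
        out = patches.max(axis=-1)
        mask = patches == out[..., None]   # marks the max in each 2x2 block (ties are all marked)
        return out, mask

    def maxpool_backward(grad_out, mask):
        """Route the upstream gradient back to the positions that achieved the max."""
        Hp, Wp, _ = mask.shape
        grad_patches = mask * grad_out[..., None]    # zero everywhere except at the argmax
        return grad_patches.reshape(Hp, Wp, 2, 2).transpose(0, 2, 1, 3).reshape(Hp * 2, Wp * 2)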
12 votes · 1 answer

Why use ReLU over Leaky ReLU?

From my understanding, a leaky ReLU attempts to address the issues of vanishing gradients and non-zero-centeredness by keeping neurons that fire with a negative value alive. With just this information to go on, it would seem that the leaky ReLU is just an…
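For reference, a small sketch of the two activations and their gradients (NumPy; the 0.01 slope is the conventional default, not a fixed rule). The only difference is that the leaky variant passes a small gradient for negative inputs, which is what keeps otherwise "dead" units trainable.

    import numpy as np

    def relu(x):
        return np.maximum(0.0, x)

    def relu_grad(x):
        return (x > 0).astype(float)         # gradient is exactly zero for x <= 0

    def leaky_relu(x, alpha=0.01):
        return np.where(x > 0, x, alpha * x)

    def leaky_relu_grad(x, alpha=0.01):
        return np.where(x > 0, 1.0, alpha)   # small but nonzero gradient for x <= 0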
12 votes · 2 answers

What exactly is averaged when doing batch gradient descent?

I have a question about how the averaging works when doing mini-batch gradient descent. I think I now understand the general gradient descent algorithm, but only for online learning. When doing mini-batch gradient descent, do I have to: forward…
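A minimal sketch of the usual procedure (NumPy; grad_loss is a placeholder for whatever backpropagation routine computes the per-example gradient): run forward and backward for every example in the batch, average the per-example gradients, and take one update step with that average.

    import numpy as np

    def minibatch_step(params, batch_x, batch_y, grad_loss, lr=0.01):
        """One mini-batch gradient-descent step.

        grad_loss(params, x, y) is assumed to return the gradient of the
        per-example loss with respect to params (same shape as params).
        """
        grads = [grad_loss(params, x, y) for x, y in zip(batch_x, batch_y)]
        avg_grad = np.mean(grads, axis=0)    # average over the batch
        return params - lr * avg_grad        # single update with the averaged gradient

In practice the whole batch is usually pushed through one vectorised forward/backward pass, which yields the same averaged gradient in one go.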
12 votes · 5 answers

What is "backprop"?

What does "backprop" mean? Is the "backprop" term basically the same as "backpropagation" or does it have a different meaning?
kenorb
10 votes · 2 answers

What are the learning limitations of neural networks trained with backpropagation?

In 1969, Seymour Papert and Marvin Minsky showed that perceptrons could not learn the XOR function. This was solved by backpropagation-trained networks with at least one hidden layer, which can learn the XOR function. I believe I was once…
10 votes · 2 answers

What advantages do evolutionary algorithms have over conventional backpropagation methods?

What advantages does employing evolutionary algorithms to design and train artificial neural networks have over using conventional backpropagation algorithms?
10 votes · 1 answer

Is back-propagation applied for each data point or for a batch of data points?

I am new to deep learning and trying to understand the concept of back-propagation. I have a question about when back-propagation is applied. Assume that I have a training data set of 1000 images of handwritten letters. Is back-propagation…
8 votes · 3 answers

How do I know if my backpropagation is implemented correctly?

I'm working on an implementation of the backpropagation algorithm for a simple neural network, which predicts the probability of survival (a binary label, 1 or 0). However, I can't get it above 80%, no matter how I tune the hyperparameters. I suspect…
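The standard sanity check is finite-difference gradient checking. A sketch, assuming your implementation exposes loss(params) and its backpropagated gradient analytic_grad(params) on a contiguous NumPy array (both names are placeholders): perturb each parameter slightly and compare the numerical slope with the backpropagated one.

    import numpy as np

    def gradient_check(loss, analytic_grad, params, eps=1e-5):
        """Compare backprop gradients against central finite differences."""
        numeric = np.zeros_like(params)
        flat = params.ravel()                 # view into params (assumes contiguity)
        for i in range(flat.size):
            old = flat[i]
            flat[i] = old + eps
            plus = loss(params)
            flat[i] = old - eps
            minus = loss(params)
            flat[i] = old                     # restore the parameter
            numeric.ravel()[i] = (plus - minus) / (2 * eps)
        g = analytic_grad(params)
        # a relative error well below ~1e-5 usually indicates a correct implementation
        return np.abs(numeric - g).max() / (np.abs(numeric).max() + np.abs(g).max() + 1e-12)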
8 votes · 1 answer

What do symmetric weights mean, and how do they make backpropagation biologically implausible?

I was reading a paper on alternatives to backpropagation as a learning algorithm in neural networks. In this paper, the author talks about the disadvantages of backpropagation, and one of the disadvantages stated is that backpropagation requires…
0jas
8 votes · 3 answers

How does backprop work through the random sampling layer in a variational autoencoder?

Implementations of variational autoencoders that I've looked at all include a sampling layer as the last layer of the encoder block. The encoder learns to generate a mean and standard deviation for each input, and samples from the resulting distribution to get the input's…
Luke Wolcott
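For context, the mechanism usually cited here is the reparameterization trick: the sample is rewritten as a deterministic, differentiable function of the encoder outputs plus external noise, so gradients flow into the mean and (log-)standard deviation while the noise itself needs no gradient. A minimal NumPy sketch (variable names are mine):

    import numpy as np

    def sample_latent(mu, log_sigma, rng=np.random):
        """Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, I).

        Gradients with respect to mu and log_sigma flow through this
        expression by ordinary backprop; eps is treated as a constant input.
        """
        eps = rng.standard_normal(mu.shape)
        return mu + np.exp(log_sigma) * eps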
8 votes · 1 answer

Which loss function should I use in REINFORCE, and what are the labels?

I understand that this is the update for the parameters of a policy in REINFORCE: $$ \Delta \theta_{t}=\alpha \nabla_{\theta} \log \pi_{\theta}\left(a_{t} \mid s_{t}\right) v_{t}, $$ where $v_t$ is usually the discounted future reward and …
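A common way to phrase this update as a loss for an automatic-differentiation framework (a sketch, not tied to any particular library; log_probs and returns are my names): minimise the negative of the return-weighted log-probability of the actions actually taken, so that a gradient step on this surrogate reproduces the update above, with $v_t$ entering as a constant weight rather than a label.

    import numpy as np

    def reinforce_loss(log_probs, returns):
        """Surrogate loss whose negative gradient is the REINFORCE update direction.

        log_probs: log pi_theta(a_t | s_t) for the sampled actions
        returns:   v_t (e.g. discounted future rewards), treated as constants
        """
        return -np.mean(log_probs * returns)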
8 votes · 1 answer

Why do we update all layers simultaneously while training a neural network?

Very deep models involve the composition of several functions or layers. The gradient tells how to update each parameter, under the assumption that the other layers do not change. In practice, we update all of the layers simultaneously. The above…