Questions tagged [derivative]
18 questions
5
votes
1 answer
Is PyTorch's `grad_fn` for a non-differentiable function that function's inverse?
What is grad_fn for a non-differentiable function like slicing (grad_fn=&lt;SliceBackward0&gt;), view (grad_fn=&lt;ViewBackward0&gt;), etc.? Is grad_fn simply the function's inverse operation?
Where in the source code can I see the implementation of…
Geremia
- 555
- 1
- 5
- 12
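A minimal PyTorch sketch of what `grad_fn` actually holds (my own illustration, not the asker's code): each `grad_fn` node stores the *backward* function of the operation — a vector-Jacobian product — not its inverse. For a slice, the backward scatters incoming gradients into the sliced positions and fills the rest with zeros, which is not an un-slicing.

```python
# grad_fn stores the backward (vector-Jacobian product) of the op,
# not the op's inverse. Assumes PyTorch is installed.
import torch

x = torch.arange(6.0, requires_grad=True)
y = x[2:5]  # slicing: produces a node like SliceBackward0

# The grad_fn name identifies the backward node for the op.
print(type(y.grad_fn).__name__)  # e.g. "SliceBackward0"

# Backward through the slice scatters gradients into the sliced
# positions; it does not reconstruct (invert) the original tensor.
y.sum().backward()
print(x.grad)  # tensor([0., 0., 1., 1., 1., 0.])
```

The implementations live in PyTorch's autograd engine; many backward formulas are generated from `derivatives.yaml` in the PyTorch source tree.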
4
votes
1 answer
Why is my derivation of the back-propagation equations inconsistent with Andrew Ng's slides from Coursera?
I am using the cross-entropy cost function and calculating its derivatives with respect to the different variables $Z, W$ and $b$ at different stages. Please refer to the image below for the calculation.
As far as I know, my derivation is correct for $dZ, dW, db$ and…
learner
- 151
- 5
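A finite-difference sanity check (my own sketch, using Andrew Ng's variable naming as an assumption) for the standard Coursera-style formulas with a single sigmoid unit and binary cross-entropy: $dZ = A - Y$, $dW = \frac{1}{m} dZ\, X^T$, $db = \text{mean}(dZ)$.

```python
# Numeric check of dZ = A - Y, dW = dZ @ X.T / m for a sigmoid unit
# with binary cross-entropy. Names follow Ng's convention (assumed).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5))                       # 3 features, 5 examples
Y = rng.integers(0, 2, size=(1, 5)).astype(float)
W = rng.normal(size=(1, 3))
b = 0.1
m = X.shape[1]

def cost(W, b):
    A = 1.0 / (1.0 + np.exp(-(W @ X + b)))
    return -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))

# Analytic gradients from the slide formulas
A = 1.0 / (1.0 + np.exp(-(W @ X + b)))
dZ = A - Y
dW = dZ @ X.T / m
db = dZ.mean()

# Finite-difference check on one weight
eps = 1e-6
Wp = W.copy(); Wp[0, 0] += eps
num = (cost(Wp, b) - cost(W, b)) / eps
print(abs(num - dW[0, 0]) < 1e-4)  # True
```

If a hand derivation disagrees with the slides, a check like this pins down which of $dZ, dW, db$ carries the error.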
3
votes
1 answer
How is the max function differentiable wrt multiple arguments?
I recently came across an answer on StackOverflow that mentioned the max function being differentiable with respect to its values.
From my current understanding of mathematics, I'm struggling to comprehend how this is possible.
Could someone help…
Peyman
- 624
- 1
- 6
- 14
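A short numerical illustration of the usual resolution (my own example): away from ties, $\max(x_1,\dots,x_n)$ locally equals its largest coordinate, so it is differentiable there with gradient 1 at the argmax and 0 elsewhere; at ties one picks a subgradient.

```python
# Finite-difference gradient of max: 1 at the argmax, 0 elsewhere
# (valid wherever the maximum is unique).
import numpy as np

x = np.array([0.3, 2.0, -1.0])
eps = 1e-6
grad = np.zeros_like(x)
for i in range(len(x)):
    xp = x.copy()
    xp[i] += eps                         # perturb one coordinate
    grad[i] = (xp.max() - x.max()) / eps # directional sensitivity
print(np.round(grad))  # [0. 1. 0.]
```

This is exactly what autodiff frameworks implement for `max`: the incoming gradient is routed to the winning element.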
3
votes
2 answers
Why are the terms critical points and stationary points used interchangeably?
Consider the following paragraph from the Numerical Computation chapter of the Deep Learning book.
When $f'(x) = 0$, the derivative provides no information about which direction to move. Points where $f'(x) = 0$ are known as critical points, or stationary…
hanugm
- 4,102
- 3
- 29
- 63
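A small worked example of why "no information" matters (my own sketch): for $f(x) = x^3$, the point $x = 0$ is a critical/stationary point ($f'(0) = 0$) yet is neither a minimum nor a maximum, so the vanishing derivative alone cannot tell you which way to move.

```python
# f(x) = x^3 has a stationary point at 0 that is not an extremum.
def f(x):
    return x ** 3

def fprime(x, eps=1e-6):
    # central finite difference for f'
    return (f(x + eps) - f(x - eps)) / (2 * eps)

print(abs(fprime(0.0)) < 1e-9)    # True: derivative vanishes at 0
print(f(-0.1) < f(0.0) < f(0.1))  # True: f still increases through 0
```

Both books' terms name the same set $\{x : f'(x) = 0\}$; "stationary" emphasizes that $f$ is momentarily flat, "critical" that the first-order test is inconclusive there.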
2
votes
0 answers
What is the dimensionality of these derivatives in the paper "Active Learning for Reward Estimation in Inverse Reinforcement Learning"?
I'm trying to implement in code part of the following paper: Active Learning for Reward Estimation in Inverse Reinforcement Learning.
I'm specifically referring to section 2.3 of the paper.
Let's define $\mathcal{X}$ as the set of states, and…
ИванКарамазов
- 141
- 5
1
vote
0 answers
Neural Networks that fit vector transforms
I have a CNN that is image-to-image and maps a binary image input to a binary image output. These are usually simple shapes, like a rectangle or a circle. Usually they become somewhat smoothed (the effect of lithography).
However, for my workflow I…
R S
- 11
- 1
1
vote
2 answers
Direct formula for calculating the optimum matrix which minimizes the perceptron error
Suppose we have a perceptron without bias, with $f(x) = x$ as the activation function, and matrices $X, Y, W$, where the training inputs are the columns of $X$, $Y$ is the target matrix (its columns ordered to match the corresponding inputs), and $W$ is the…
hasanghaforian
- 113
- 5
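A sketch of the closed form, assuming the error being minimized is the squared error $\|WX - Y\|_F^2$: with identity activation this is ordinary linear least squares, and the minimizer is given by the pseudoinverse, $W^* = Y X^+$.

```python
# Closed-form least-squares solution W* = Y X^+ for a linear,
# bias-free perceptron (assumes squared error as the loss).
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 10))   # training inputs as columns
W_true = rng.normal(size=(2, 4))
Y = W_true @ X                 # noiseless targets for the check

W = Y @ np.linalg.pinv(X)      # direct formula, no gradient descent
print(np.allclose(W, W_true))  # True: X has full row rank here
```

With noisy targets the same formula returns the least-squares optimum rather than an exact fit; when $XX^T$ is invertible it equals $W^* = YX^T(XX^T)^{-1}$.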
1
vote
0 answers
In MLP, to calculate the delta, do I need to calculate the derivative of the cost function? Or can I just use the cost function result?
In Multi-Layer Perceptron networks, in the formula for the error in the output layer, some articles say it is "deltaOutput = (predict - expected) * derivativeOutput". But in other sources, I saw that they…
will The J
- 267
- 1
- 6
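A sketch of why both forms appear in the literature (my own check): with a sigmoid output $a = \sigma(z)$ and *cross-entropy* loss, the $\sigma'$ factor cancels and the delta is just $a - y$; with *squared error* the delta keeps the derivative factor, $(a - y)\,\sigma'(z)$. So which formula is right depends on the cost function.

```python
# Finite-difference check of dL/dz for a sigmoid output unit
# under two different losses.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z, y, eps = 0.7, 1.0, 1e-6
a = sigmoid(z)

# Cross-entropy: L = -(y log a + (1-y) log(1-a))  ->  dL/dz = a - y
ce = lambda z: -(y * np.log(sigmoid(z)) + (1 - y) * np.log(1 - sigmoid(z)))
num_ce = (ce(z + eps) - ce(z - eps)) / (2 * eps)
print(abs(num_ce - (a - y)) < 1e-6)  # True: no derivative factor

# Squared error: L = 0.5 (a - y)^2  ->  dL/dz = (a - y) * a * (1 - a)
se = lambda z: 0.5 * (sigmoid(z) - y) ** 2
num_se = (se(z + eps) - se(z - eps)) / (2 * eps)
print(abs(num_se - (a - y) * a * (1 - a)) < 1e-6)  # True: sigmoid' kept
```

So "(predict - expected)" alone is correct for sigmoid/softmax outputs trained with cross-entropy, while the version multiplied by the activation derivative is correct for squared error.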
1
vote
0 answers
Neural network learns to mimic distribution of classes in dataset instead of using signal from input
I'm trying to implement an example from a classic AI paper, "Learning representations by back-propagating errors" by Rumelhart, Hinton, and Williams.
The example aims at training a network able to predict the third term in triples of (person_0, relationship, person_1) across…
Jan Grzybek
- 11
- 1
1
vote
2 answers
How the vector-space isomorphism between $\mathbb{R}^{m \times n}$ and $\mathbb{R}^{mn}$ guarantees reshaping matrices to vectors?
Consider the following paragraph from section 5.4 Gradients of Matrices of the chapter Vector Calculus from the textbook Mathematics for Machine Learning by Marc Peter Deisenroth et al.
Since matrices represent linear mappings, we can…
hanugm
- 4,102
- 3
- 29
- 63
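A concrete numpy illustration of the isomorphism (my example, not the book's): `reshape` is an invertible linear map between $\mathbb{R}^{m \times n}$ and $\mathbb{R}^{mn}$, so flattening a matrix, working in the vector space, and reshaping back loses no information and preserves linear combinations — which is exactly what justifies computing a gradient in vectorized form and reshaping it to matrix shape.

```python
# reshape as a vector-space isomorphism between R^{2x3} and R^6.
import numpy as np

A = np.arange(6.0).reshape(2, 3)
v = A.reshape(-1)            # vec: R^{2x3} -> R^6
B = v.reshape(2, 3)          # inverse map: R^6 -> R^{2x3}
print(np.array_equal(A, B))  # True: no information is lost

# Linearity is preserved: vec(2A + 3C) = 2 vec(A) + 3 vec(C)
C = np.ones((2, 3))
lhs = (2 * A + 3 * C).reshape(-1)
rhs = 2 * A.reshape(-1) + 3 * C.reshape(-1)
print(np.array_equal(lhs, rhs))  # True
```

The only bookkeeping to watch is the flattening order (row-major vs column-major); the isomorphism holds either way as long as you reshape back with the same convention.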
1
vote
2 answers
What does it mean "having Lipschitz continuous derivatives"?
We can enforce some constraints on functions used in deep learning in order to gain optimization guarantees. You can find this in the Numerical Computation chapter of the Deep Learning book.
In the context of deep learning, we sometimes gain some guarantees…
hanugm
- 4,102
- 3
- 29
- 63
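A numerical illustration of the definition (my own example): $f$ has a Lipschitz continuous derivative if there is a constant $L$ with $|f'(x) - f'(y)| \le L\,|x - y|$ for all $x, y$. For $f = \sin$, $f' = \cos$ and $L = 1$ works; for $f(x) = |x|$, the derivative jumps at 0, so no finite $L$ exists.

```python
# Empirical check of |f'(x) - f'(y)| <= L |x - y| on a grid.
import numpy as np

xs = np.linspace(-3, 3, 1001)

# f = sin: the difference quotients of f' = cos stay below L = 1.
ratios = np.abs(np.diff(np.cos(xs))) / np.abs(np.diff(xs))
print(float(ratios.max()) <= 1.0)  # True

# f = |x|: f' = sign(x) jumps at 0, so the ratio blows up there
# (and keeps growing as the grid is refined -- no finite L).
ratios_abs = np.abs(np.diff(np.sign(xs))) / np.abs(np.diff(xs))
print(float(ratios_abs.max()))  # large
```

Intuitively, a Lipschitz continuous derivative bounds how fast the slope can change, which is what lets gradient-based methods pick safe step sizes.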
1
vote
0 answers
BlackOut - ICLR 2016: need help understanding the cost function derivative
In the ICLR 2016 paper BlackOut: Speeding up Recurrent Neural Network Language Models with very Large Vocabularies, on page 3, for eq. 4:
$$ J_{ml}^s(\theta) = \log p_{\theta}(w_i \mid s) $$
They have shown the gradient computation in the subsequent…
anurag
- 151
- 1
- 7
0
votes
1 answer
Why does "in-place" mutation cause automatic differentiation to fail, and how to write code to avoid this problem
I work with a few different automatic differentiation frameworks, including PyTorch, JAX, and Flux in Julia. Periodically I run some code and I get errors about mutations or operations occurring "in-place." These errors generally cause the program…
krishnab
- 207
- 2
- 8
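A minimal PyTorch reproduction of the failure mode (my own sketch): reverse-mode autodiff saves intermediate values for the backward pass, and mutating a tensor the engine still needs invalidates that saved value, so `backward()` raises a `RuntimeError`. The fix is to use the out-of-place operation, which allocates a fresh tensor.

```python
# In-place mutation of a tensor saved for backward breaks autograd.
import torch

x = torch.ones(3, requires_grad=True)
y = torch.exp(x)   # exp's backward needs its own output y
y += 1             # in-place: overwrites the saved value
try:
    y.sum().backward()
except RuntimeError:
    print("in-place op broke the backward pass")

# Out-of-place version: y's saved value is left untouched.
x = torch.ones(3, requires_grad=True)
y = torch.exp(x)
z = y + 1          # allocates a new tensor instead of mutating y
z.sum().backward()
print(x.grad)      # exp(1) in every entry
```

JAX sidesteps the issue differently: its arrays are immutable, and updates go through functional primitives such as `x.at[i].set(v)` that return a new array.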
0
votes
1 answer
What is the correct partial derivative of $Y^c$ with respect to $A_{ij}^{kc}$?
I have a question about the Grad-CAM++ paper. I do not understand how the following equation (10) for the alphas is obtained:
$$
\alpha_{ij}^{kc} =
\frac{\frac{\partial^2 Y^c}{(\partial A_{ij}^k)^2}}
{2\frac{\partial^2 Y^c}{(\partial A_{ij}^k)^2}
…
mlerma54
- 141
- 5
0
votes
1 answer
What is the rigorous and formal definition for the direction pointed by a gradient?
Consider the following definition of the derivative from the chapter named Vector Calculus from the textbook titled Mathematics for Machine Learning by Marc Peter Deisenroth et al.
Definition 5.2 (Derivative). More formally, for $h>0$ the…
hanugm
- 4,102
- 3
- 29
- 63
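A numerical illustration of the formal statement (my own example): the gradient's direction is the unit vector $u$ that maximizes the directional derivative $\lim_{h\to 0} \frac{f(x + hu) - f(x)}{h}$ — the rigorous sense in which the gradient "points in the direction of steepest ascent".

```python
# Among unit directions u, the finite-difference directional
# derivative of f at x is largest when u is parallel to grad f(x).
import numpy as np

def f(p):
    return p[0] ** 2 + 3 * p[1] ** 2

x = np.array([1.0, 1.0])
grad = np.array([2 * x[0], 6 * x[1]])   # analytic gradient: (2, 6)
h = 1e-6

best_u, best_d = None, -np.inf
for theta in np.linspace(0, 2 * np.pi, 720, endpoint=False):
    u = np.array([np.cos(theta), np.sin(theta)])  # unit direction
    d = (f(x + h * u) - f(x)) / h                 # directional derivative
    if d > best_d:
        best_d, best_u = d, u

# The maximizing direction matches the normalized gradient.
print(np.allclose(best_u, grad / np.linalg.norm(grad), atol=1e-2))  # True
```

Formally this follows from $D_u f(x) = \nabla f(x) \cdot u = \|\nabla f(x)\| \cos\theta$, which is maximized at $\theta = 0$, i.e. when $u$ is the unit vector along $\nabla f(x)$.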