Questions tagged [sigmoid]
For questions about sigmoid functions (in particular, the logistic function) and the consequences of using them as activation functions in neural networks.
38 questions
10 votes · 3 answers
Are ReLUs incapable of solving certain problems?
Background
I've been interested in and reading about neural networks for several years, but I haven't gotten around to testing them out until recently.
Both for fun and to increase my understanding, I tried to write a class library from scratch in…
Benjamin Chambers · 221
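For context on the question above: a ReLU outputs zero for every negative input, so its gradient is identically zero there, which is where most "can ReLU solve X" discussions start. A minimal NumPy sketch (function names are mine):

import numpy as np

def relu(x):
    # Identity for positive inputs, zero otherwise.
    return np.maximum(0.0, x)

def relu_grad(x):
    # The gradient is exactly 0 for x < 0: a unit stuck in that
    # region receives no learning signal (the "dying ReLU" problem).
    return (x > 0).astype(float)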
9 votes · 1 answer
What happens when I mix activation functions?
There are several activation functions, such as ReLU, sigmoid or $\tanh$. What happens when I mix activation functions?
I recently found that Google has developed the Swish activation function, which is x*sigmoid(x). By altering the activation function can it…
JSChang · 93
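For reference, the Swish function mentioned in the excerpt is simple to write out; a minimal sketch (names are mine):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def swish(x):
    # Swish as described in the question: x * sigmoid(x).
    return x * sigmoid(x)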
7 votes · 1 answer
How is division by zero avoided when implementing back-propagation for a neural network with sigmoid at the output neuron?
I am building a neural network for which I am using the sigmoid function as the activation function for the single output neuron at the end. Since the sigmoid function is known to take any number and return a value between 0 and 1, this is causing…
Dimitry · 73
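One standard way this particular division by zero is avoided (a common approach, not necessarily the accepted answer): with a sigmoid output and a cross-entropy loss, the naive chain rule divides by the activation, but the combined gradient with respect to the pre-activation simplifies to $a - y$ and needs no division at all. A sketch:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Naive chain rule: dL/da = -y/a + (1 - y)/(1 - a), which blows up
# when a hits exactly 0 or 1. Combining sigmoid with cross-entropy
# cancels those divisions:
def grad_wrt_preactivation(z, y):
    a = sigmoid(z)
    return a - y  # dL/dz, with no division by a or (1 - a)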
7 votes · 4 answers
What does "e" do in the Sigmoid Activation Function?
Within the Sigmoid Squishification function
$$f(x) = \frac{1}{1 + e^{-x}},$$
"e" is unnecessary, as it can be replaced by any other value that is not 0 or 1. Why is "e" used here?
As shown below, the function is working well without that, and in replacement,…
Jake · 181
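A fact worth noting alongside this question: replacing $e$ with another base $b > 1$ only rescales the input, because $b^{-x} = e^{-x \ln b}$. A quick numerical check (variable names are mine):

import numpy as np

x = np.linspace(-5.0, 5.0, 11)
base2 = 1.0 / (1.0 + 2.0 ** (-x))                # "sigmoid" with base 2
via_e = 1.0 / (1.0 + np.exp(-x * np.log(2.0)))   # same curve through e

print(np.allclose(base2, via_e))  # True: changing base = rescaling x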
4 votes · 1 answer
Why is it a problem if the outputs of an activation function are not zero-centered?
In this lecture, the professor says that one problem with the sigmoid function is that its outputs aren't zero-centered. The explanation provided by the professor for why this is bad is that the gradient of our loss w.r.t. the weights…
Daviiid · 585
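The standard argument behind that lecture point: sigmoid outputs are always positive, so every weight of a downstream neuron receives a gradient with the same sign as the single upstream error term. A small illustration (the numbers are made up):

import numpy as np

a = np.array([0.2, 0.7, 0.9])  # sigmoid outputs feeding a neuron: all positive
delta = -1.3                   # upstream error term for that neuron

grad_w = delta * a             # dL/dw_i = delta * a_i
print(grad_w)                  # every entry shares delta's sign, so updates
                               # must zig-zag toward mixed-sign optima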
4 votes · 0 answers
Why does sigmoid saturation prevent signal flow through the neuron?
As per these slides on page 35:
Sigmoids saturate and kill gradients.
When the neuron's activation saturates at either tail of 0 or 1, the gradient at these regions is almost zero, so it effectively "kills" the gradient and almost no signal will flow through the neuron…
EEAH · 193
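The derivative behind that claim is $\sigma'(x) = \sigma(x)\,(1 - \sigma(x))$, which peaks at $0.25$ and collapses toward zero in both tails. A quick check:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for x in (-10.0, -5.0, 0.0, 5.0, 10.0):
    s = sigmoid(x)
    print(x, s * (1.0 - s))  # 0.25 at x = 0, ~4.5e-5 at |x| = 10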
4 votes · 1 answer
Neural network doesn't seem to converge with ReLU but it does with Sigmoid?
I'm not really sure if this is the sort of question to ask on here, since it is less of a general question about AI and more about the coding of it; however, I thought it wouldn't fit on Stack Overflow.
I have been programming a multilayer perceptron…
finlay morrison · 151
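Two remedies that often come up for this exact symptom (whether either applies to the asker's code is unknown): He initialization, which scales initial weights for ReLU layers, and leaky ReLU, which keeps a small gradient on the negative side. A hedged sketch:

import numpy as np

fan_in, fan_out = 128, 64
# He initialization: variance 2/fan_in suits ReLU layers.
W = np.random.randn(fan_in, fan_out) * np.sqrt(2.0 / fan_in)

def leaky_relu(x, alpha=0.01):
    # A small negative slope keeps gradients from dying for x < 0.
    return np.where(x > 0, x, alpha * x)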
4 votes · 1 answer
Can neural networks with a sigmoid as the activation function of the output layer approximate continuous functions?
Neural networks are commonly used for classification tasks, in fact from this post it seems like that's where they shine brightest.
However, when we want to classify using neural networks, we often have the output layer take values in $[0,1]$;…
AB_IM · 634
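One standard observation relevant to this question (my paraphrase, not the linked post): a $(0,1)$-valued approximator $g$ loses no generality for continuous targets on a compact domain, since a target $f$ with range inside $[m, M]$ is recovered by rescaling:
$$\hat{f}(x) = m + (M - m)\,g(x).$$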
3 votes · 3 answers
Why is there tanh(x)*sigmoid(x) in an LSTM cell?
CONTEXT
I was wondering why there are sigmoid and tanh activation functions in an LSTM cell.
My intuition was based on the flow of tanh(x)*sigmoid(x) and the derivative of tanh(x)*sigmoid(x).
It seems to me that the authors wanted to choose such a…
MASTER OF CODE · 242
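For reference, the product in question appears in the standard LSTM hidden-state update, where a sigmoid output gate multiplies a tanh of the cell state:
$$o_t = \sigma\!\left(W_o\,[h_{t-1}, x_t] + b_o\right), \qquad h_t = o_t \odot \tanh(c_t).$$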
3 votes · 1 answer
Accuracy dropped when I ran the program the second time
I was following a tutorial about Feed-Forward Networks and wrote this code for a simple FFN:
import numpy as np

class FirstFFNetwork:
    # initialize the parameters
    def __init__(self):
        self.w1 = np.random.randn()
        self.w2 = np.random.randn()
        self.w3 =…
Eeshaan Jain · 31
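The usual explanation for accuracy changing between runs of code like the above is unseeded random initialization; fixing the seed makes runs reproducible. A minimal sketch (not from the tutorial):

import numpy as np

np.random.seed(42)      # fix the RNG so every run draws identical weights
w1 = np.random.randn()
print(w1)               # same value on every run with the same seed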
2 votes · 1 answer
How do I avoid the "math domain error" when the input to the log is zero in the objective function of a neural network?
I am implementing a neural network to train it on handwritten digits.
Here is the cost function that I am implementing.
$$J(\Theta)=-\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K}\left[y_{k}^{(i)} \log…
Gokulakannan · 73
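The standard guard for this error is clipping the predictions away from exactly 0 and 1 before the logarithm; a minimal sketch (the epsilon is a common choice, not from the question):

import numpy as np

def safe_log_loss(y, y_hat, eps=1e-12):
    # Clip predictions into [eps, 1 - eps] so log never receives 0.
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))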
2 votes · 2 answers
When should I use a linear unit instead of sigmoid in the output layer?
In which types of learning tasks are linear units more useful than sigmoid activation functions in the output layer of a multi-layer neural network?
DSPinfinity · 1,223
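The short version, sketched in the same tf.keras style another question on this page uses (the layer choices are illustrative): a linear unit suits unbounded regression targets, a sigmoid suits probabilities.

import tensorflow as tf

# Regression: unbounded real-valued target -> linear output unit.
regression_head = tf.keras.layers.Dense(1)

# Binary classification: probability in (0, 1) -> sigmoid output unit.
classification_head = tf.keras.layers.Dense(1, activation="sigmoid")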
2 votes · 2 answers
How would you go from 1 to k hidden layers in Cybenko's result that neural networks are universal approximators?
Cybenko showed that if $\sigma$ is a sigmoidal, continuous function, then for any $\varepsilon > 0$, for any continuous function $f: [0, 1]^d \to \mathbb{R}$, there exists a function of the form $g:x \mapsto \sum\limits_{i = 1}^n a_i\sigma\left(…
JackEight · 123
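For readers skimming past the truncated formula: the standard one-hidden-layer form in Cybenko's theorem is
$$g(x) = \sum_{i=1}^{n} a_i\,\sigma\!\left(w_i^{\top} x + b_i\right),$$
and the question asks how the density argument extends from one hidden layer to $k$.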
2 votes · 1 answer
If the output is 0.09, does this mean that the prediction is class 1 or 0?
I use a Keras EfficientNetB7 and transfer learning to solve a binary classification problem. I use tf.keras.layers.Dense(1, activation="sigmoid")(x) for my final layer.
My labels are encoded as the following for the model.fit():
[[1.]
[1.]
[0.]
…
Doug · 125
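Given the single sigmoid unit and 0/1 labels shown above, the output is read as the estimated $P(y = 1)$, so 0.09 falls on the class-0 side under the conventional 0.5 threshold. A one-line sketch (the threshold is the usual default, not from the question):

p = 0.09              # sigmoid output = estimated P(y = 1)
pred = int(p >= 0.5)  # conventional 0.5 decision threshold
print(pred)           # 0 -> predicted class 0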
2 votes · 1 answer
How do sigmoid functions make it so that the prediction $\hat{y}$ indicates the probability that the observed value, $y$, is $1$?
I am currently studying the textbook Neural Networks and Deep Learning by Charu C. Aggarwal. Chapter 1.2.1.3 Choice of Activation and Loss Functions says the following:
The choice of activation function is a critical part of neural network design.…
The Pointer · 611
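The convention behind this question is the logistic-regression pairing of a sigmoid output with a Bernoulli model of the label, so the activation itself is the modeled probability:
$$\hat{y} = P(y = 1 \mid x) = \sigma\!\left(w^{\top} x\right) = \frac{1}{1 + e^{-w^{\top} x}}.$$
(This is the standard convention, not a quotation from Aggarwal's text.)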