Questions tagged [keras]

For questions related to Keras, the modular neural-network library written in Python. Note that purely programming questions are off-topic here.

See: Keras Documentation

267 questions
37 votes · 6 answers

Why do CNNs sometimes make highly confident mistakes, and how can one combat this problem?

I trained a simple CNN on the MNIST database of handwritten digits to 99% accuracy. I'm feeding in a mix of handwritten digits and non-digits from a document. I want the CNN to report errors, so I set a threshold of 90% certainty below which my…
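A minimal NumPy sketch of the thresholding idea described in the question (the 90% cutoff and the rejection label -1 are illustrative choices; note that raw softmax scores are often poorly calibrated, so a high score is not a true probability of correctness):

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predict_with_rejection(logits, threshold=0.9):
    """Return the predicted class, or -1 when the top softmax
    probability falls below the confidence threshold."""
    probs = softmax(logits)
    top = probs.max(axis=-1)
    labels = probs.argmax(axis=-1)
    return np.where(top >= threshold, labels, -1)

logits = np.array([[8.0, 0.5, 0.1],    # confident prediction
                   [1.0, 0.9, 0.8]])   # ambiguous prediction
print(predict_with_rejection(logits))  # → [ 0 -1]
```

The catch the question runs into is that a CNN can produce near-saturated softmax outputs even on inputs far from its training distribution, so thresholding alone does not reliably catch non-digits.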
22 votes · 2 answers

Why would you implement the position-wise feed-forward network of the transformer with convolution layers?

The Transformer model introduced in "Attention is all you need" by Vaswani et al. incorporates a so-called position-wise feed-forward network (FFN): In addition to attention sub-layers, each of the layers in our encoder and decoder contains a…
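The question hinges on the fact that a convolution with kernel size 1 is exactly a dense layer applied independently at every sequence position, which is what "position-wise" means. A small NumPy sketch (the shapes and random weights are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
seq = rng.normal(size=(10, 4))   # (positions, d_model)
W = rng.normal(size=(4, 6))      # projection d_model -> d_ff
b = rng.normal(size=(6,))

# Position-wise dense: the same weights applied at each position
dense_out = seq @ W + b

# Kernel-size-1 "convolution": slide a width-1 filter over positions
conv_out = np.stack([seq[t] @ W + b for t in range(seq.shape[0])])

print(np.allclose(dense_out, conv_out))  # → True
```

So the two implementations compute the same function; which one is faster depends on the framework and hardware, not on the math.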
8 votes · 2 answers

Can LSTM neural networks be sped up by a GPU?

I am training LSTM neural networks with Keras on a small mobile GPU. Training is slower on the GPU than on the CPU. I found some articles saying that it is hard to train LSTMs (and RNNs in general) on GPUs because the training cannot be…
Dieshe · 289
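One common cause, assuming TensorFlow 2.x: `tf.keras.layers.LSTM` only dispatches to the fast fused cuDNN kernel when its arguments keep their defaults (`activation='tanh'`, `recurrent_activation='sigmoid'`, `recurrent_dropout=0`, `unroll=False`, `use_bias=True`). A sketch of the two cases:

```python
import tensorflow as tf

# Eligible for the fused cuDNN kernel on a GPU (all defaults kept)
fast = tf.keras.layers.LSTM(64)

# Falls back to the slow generic implementation, because
# recurrent_dropout is non-zero
slow = tf.keras.layers.LSTM(64, recurrent_dropout=0.2)

x = tf.random.normal((8, 20, 32))   # (batch, timesteps, features)
print(fast(x).shape, slow(x).shape)  # both (8, 64)
```

Even with the cuDNN path, very small models and batches may still be faster on the CPU because per-step kernel-launch overhead dominates; that would be consistent with the mobile-GPU observation above.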
8 votes · 1 answer

Validation accuracy higher than training accuracy

I implemented a U-Net in TensorFlow for the segmentation of MRI images of the thigh. I noticed that I always get a slightly higher validation accuracy, independent of the initial split. So I researched when this could be…
Lis Louise · 139
8 votes · 2 answers

Effect of batch size and number of GPUs on model accuracy

I have a data set that was split using a fixed random seed, and I am going to use 80% of the data for training and the rest for validation. Here are my GPU and batch size configurations: use a batch size of 64 with one GTX 1080 Ti; use a batch size of 128 with…
bit_scientist · 241
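One relevant interaction, stated as a rule of thumb rather than a guarantee: with synchronous data parallelism, the effective batch size is the per-GPU batch size times the number of GPUs, and the linear-scaling heuristic adjusts the learning rate in the same proportion. A plain-Python sketch (the base values are illustrative):

```python
def effective_batch(per_gpu_batch, n_gpus):
    # Synchronous data parallelism averages gradients across replicas,
    # so each step effectively sees per_gpu_batch * n_gpus samples.
    return per_gpu_batch * n_gpus

def scaled_lr(base_lr, base_batch, new_batch):
    # Linear scaling heuristic: grow the learning rate in proportion
    # to the growth of the effective batch size.
    return base_lr * new_batch / base_batch

batch = effective_batch(64, 2)            # 128
print(batch, scaled_lr(0.01, 64, batch))  # → 128 0.02
```

This is why changing batch size or GPU count while holding the learning rate fixed can change final accuracy: the optimization trajectory changes, not just the throughput.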
7 votes · 1 answer

Deep Q-Learning poor convergence on Stochastic Environment

I'm trying to implement a deep Q-network in Keras/TF that learns to play Minesweeper (a stochastic environment). I have noticed that the agent learns to play the game pretty well with both small and large board sizes. However, it only…
7 votes · 1 answer

Why does 'loss' change depending on the number of epochs chosen?

I am using Keras to train different NNs. I would like to know why, if I increase the number of epochs by 1, the results up to the new epoch are not the same. I am using shuffle=False and np.random.seed(2017), and I have checked that if I repeat with the same…
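Part of the answer is that a single np.random.seed call does not pin down every source of randomness Keras touches (Python's random module, the TF backend RNG, weight initialization, dropout). A sketch of fuller seeding, assuming TensorFlow ≥ 2.7 (GPU op nondeterminism can still remain):

```python
import numpy as np
import tensorflow as tf

def build_dense():
    # A layer whose kernel is initialized by the (seeded) RNG
    layer = tf.keras.layers.Dense(4)
    layer.build((None, 3))
    return layer

# Seeds Python's random, NumPy, and the TF/Keras backend in one call
tf.keras.utils.set_random_seed(0)
w1 = build_dense().get_weights()[0]

tf.keras.utils.set_random_seed(0)
w2 = build_dense().get_weights()[0]

print(np.allclose(w1, w2))  # → True: same seed, same initial weights
```

With all seeds fixed and shuffle=False, the first N epochs of an N-epoch run and an (N+1)-epoch run should coincide; any remaining drift usually points at nondeterministic GPU kernels or stateful callbacks.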
6 votes · 2 answers

Two data classes for a convolutional neural network: can one have a LOT more images for training than the other?

I have two classes in the training set: one that has images with a feature and the other of images without that feature. Can there be a LOT more images with "no feature" so I can fit in all possible false positives?
Vasya T · 69
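Heavily unbalanced classes are common; one standard mitigation in Keras is to weight the loss per class rather than discarding data. A sketch of the usual inverse-frequency ("balanced") weighting formula, with hypothetical counts; the resulting dict can be passed to model.fit(..., class_weight=...):

```python
import numpy as np

counts = np.array([9000, 1000])  # e.g. "no feature" vs "feature" images
n_classes = len(counts)

# Inverse-frequency weights: the rare class gets a larger weight,
# so each class contributes equally to the loss on average.
class_weight = {i: float(counts.sum() / (n_classes * c))
                for i, c in enumerate(counts)}
print(class_weight)  # → {0: 0.5555..., 1: 5.0}
```

So yes, the "no feature" class can be much larger; weighting (or per-class metrics such as precision/recall) keeps the model from simply predicting the majority class.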
6 votes · 1 answer

Is it possible to use deep learning to give approximate solutions to NP-hard graph theory problems?

Is it possible to use deep learning to give approximate solutions to NP-hard graph theory problems? Take, for example, the travelling salesman problem (or the dominating set problem). Let's say I have a bunch of smaller examples, where I…
6 votes · 1 answer

How to graphically represent an RNN architecture implemented in Keras?

I'm trying to create a simple blogpost on RNNs, that should give a better insight into how they work in Keras. Let's say: model = keras.models.Sequential() model.add(keras.layers.SimpleRNN(5, return_sequences=True, input_shape=[None,…
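For a rendered diagram, keras.utils.plot_model is the built-in option (it needs the pydot and graphviz packages installed), with model.summary() as a text fallback. A sketch that completes a model along the lines of the truncated snippet above (the remaining layer sizes are guesses, since the excerpt is cut off):

```python
from tensorflow import keras

model = keras.models.Sequential()
model.add(keras.Input(shape=(None, 1)))  # variable-length sequences
model.add(keras.layers.SimpleRNN(5, return_sequences=True))
model.add(keras.layers.SimpleRNN(5))
model.add(keras.layers.Dense(1))

model.summary()  # text view of layers and output shapes
try:
    # Graphical view; requires pydot + graphviz to be installed
    keras.utils.plot_model(model, "rnn.png", show_shapes=True)
except ImportError:
    pass
```

show_shapes=True annotates each box with input/output shapes, which is usually the part a blog-post reader needs to see.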
6 votes · 3 answers

Why are traditional ML models still used over deep neural networks?

I'm still taking my first steps in the data science field. I have played with some DL frameworks, like TensorFlow (pure) and Keras (on top), and I know a little bit about some "classic machine learning" algorithms, like decision trees, k-nearest neighbors,…
Douglas Ferreira · 845
5 votes · 1 answer

How to constrain the output value of a neural network?

I am training a deep neural network. There is a constraint on the output value of the neural network (e.g. the output has to be between 0 and 180). I think some possible solutions are using sigmoid or tanh activations at the end of the network. Are there…
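One common pattern for the 0-to-180 case mentioned above: let the last layer produce a sigmoid and rescale it into the target range. A NumPy sketch of the mapping (in Keras this could be, e.g., a Dense(1, activation='sigmoid') followed by a layer multiplying by 180; that wiring is one option among several, not the only answer):

```python
import numpy as np

def bounded_output(x, low=0.0, high=180.0):
    # The sigmoid squashes any real number into (0, 1);
    # rescaling maps it into the open interval (low, high).
    return low + (high - low) / (1.0 + np.exp(-x))

print(bounded_output(0.0))    # → 90.0: midpoint of the range
print(bounded_output(50.0))   # saturates just below the upper bound
print(bounded_output(-50.0))  # saturates just above the lower bound
```

A caveat worth knowing: outputs near the bounds sit on the saturated tail of the sigmoid, where gradients are tiny, so targets clustered at 0 or 180 can train slowly.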
5 votes · 0 answers

What are the ways to calculate the error rate of a deep convolutional neural network when the network produces different results using the same data?

I am new to the object recognition community. Here I am asking about the broadly accepted ways to calculate the error rate of a deep CNN when the network produces different results using the same data. 1. Problem introduction Recently I was trying…
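A broadly accepted convention for nondeterministic training is to repeat the whole train-and-evaluate cycle several times with different seeds and report the mean and sample standard deviation of the metric. A sketch with made-up accuracies from five hypothetical runs:

```python
import numpy as np

# Accuracies from five independent training runs (hypothetical values)
accs = np.array([0.912, 0.905, 0.918, 0.909, 0.913])

mean = accs.mean()
std = accs.std(ddof=1)  # ddof=1: sample standard deviation
print(f"accuracy: {mean:.3f} +/- {std:.3f} over {len(accs)} runs")
```

Reporting the spread alongside the mean is what lets a reader judge whether a difference between two models is larger than the run-to-run noise.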
5 votes · 1 answer

Over- and underestimation of the lowest and highest values in an LSTM network

I'm training an LSTM network with multiple inputs and several LSTM layers in order to set up a time series gap filling procedure. The LSTM is trained bidirectionally with "tanh" activation on the outputs of the LSTM, and one Dense layer with…
5 votes · 1 answer

How does backpropagation work on a custom loss function whose components have magnitudes of different orders?

I want to use a custom loss function that is a weighted combination of L1 and DSSIM losses. The DSSIM loss is limited between 0 and 0.5, whereas the L1 loss can be orders of magnitude greater, and is so in my case. How does backpropagation work in…
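A hedged sketch of such a combination in TensorFlow/Keras (the weight alpha and max_val=1.0 are assumptions; tf.image.ssim expects 4-D image batches at least as large as its 11x11 default filter). As for the backpropagation question: gradients flow through each term independently, so the weights scale each term's gradient directly, which is exactly why rebalancing them changes which term dominates the update:

```python
import tensorflow as tf

def make_l1_dssim_loss(alpha=0.8):
    """Weighted sum of DSSIM and L1; alpha trades one term off
    against the other (an illustrative choice, not a recommendation)."""
    def loss(y_true, y_pred):
        l1 = tf.reduce_mean(tf.abs(y_true - y_pred))
        # DSSIM = (1 - SSIM) / 2, small when images are similar
        dssim = tf.reduce_mean(
            (1.0 - tf.image.ssim(y_true, y_pred, max_val=1.0)) / 2.0)
        return alpha * dssim + (1.0 - alpha) * l1
    return loss
```

Usage would be model.compile(optimizer="adam", loss=make_l1_dssim_loss()); choosing alpha so that the two weighted terms have comparable magnitudes on typical batches is the practical way to keep one from drowning out the other.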