Questions tagged [keras]

For questions related to Keras, the modular neural-network library written in Python. Note that purely programming questions are off-topic here.

See: Keras Documentation

267 questions
37 votes · 6 answers

Why do CNNs sometimes make highly confident mistakes, and how can one combat this problem?

I trained a simple CNN on the MNIST database of handwritten digits to 99% accuracy. I'm feeding in a mix of handwritten digits and non-digits from a document. I want the CNN to report errors, so I set a threshold of 90% certainty below which my…
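A minimal NumPy sketch of the thresholding idea described in the question (the 90% cutoff and the rejection label -1 are illustrative choices; note that raw softmax scores are often poorly calibrated, so a high score is not a true probability of correctness):

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predict_with_rejection(logits, threshold=0.9):
    """Return the predicted class, or -1 when the top softmax
    probability falls below the confidence threshold."""
    probs = softmax(logits)
    top = probs.max(axis=-1)
    labels = probs.argmax(axis=-1)
    return np.where(top >= threshold, labels, -1)

logits = np.array([[8.0, 0.5, 0.1],    # confident prediction
                   [1.0, 0.9, 0.8]])   # ambiguous prediction
print(predict_with_rejection(logits))  # → [ 0 -1]
```

The catch the question runs into is that a CNN can produce near-saturated softmax outputs even on inputs far from its training distribution, so thresholding alone does not reliably catch non-digits.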
22 votes · 2 answers

Why would you implement the position-wise feed-forward network of the transformer with convolution layers?

The Transformer model introduced in "Attention is all you need" by Vaswani et al. incorporates a so-called position-wise feed-forward network (FFN): In addition to attention sub-layers, each of the layers in our encoder and decoder contains a…
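The question hinges on the fact that a convolution with kernel size 1 is exactly a dense layer applied independently at every sequence position, which is what "position-wise" means. A small NumPy sketch (the shapes and random weights are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
seq = rng.normal(size=(10, 4))   # (positions, d_model)
W = rng.normal(size=(4, 6))      # projection d_model -> d_ff
b = rng.normal(size=(6,))

# Position-wise dense: the same weights applied at each position
dense_out = seq @ W + b

# Kernel-size-1 "convolution": slide a width-1 filter over positions
conv_out = np.stack([seq[t] @ W + b for t in range(seq.shape[0])])

print(np.allclose(dense_out, conv_out))  # → True
```

So the two implementations compute the same function; which one is faster depends on the framework and hardware, not on the math.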
8 votes · 2 answers

Can LSTM neural networks be sped up by a GPU?

I am training LSTM neural networks with Keras on a small mobile GPU. Training is slower on the GPU than on the CPU. I found some articles saying that it is hard to train LSTMs (and RNNs in general) on GPUs because the training cannot be…
Dieshe · 289
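One common cause, assuming TensorFlow 2.x: `tf.keras.layers.LSTM` only dispatches to the fast fused cuDNN kernel when its arguments keep their defaults (`activation='tanh'`, `recurrent_activation='sigmoid'`, `recurrent_dropout=0`, `unroll=False`, `use_bias=True`). A sketch of the two cases:

```python
import tensorflow as tf

# Eligible for the fused cuDNN kernel on a GPU (all defaults kept)
fast = tf.keras.layers.LSTM(64)

# Falls back to the slow generic implementation, because
# recurrent_dropout is non-zero
slow = tf.keras.layers.LSTM(64, recurrent_dropout=0.2)

x = tf.random.normal((8, 20, 32))   # (batch, timesteps, features)
print(fast(x).shape, slow(x).shape)  # both (8, 64)
```

Even with the cuDNN path, very small models and batches may still be faster on the CPU because per-step kernel-launch overhead dominates; that would be consistent with the mobile-GPU observation above.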
8 votes · 1 answer

Validation accuracy higher than training accuracy

I implemented a U-Net in TensorFlow for the segmentation of MRI images of the thigh. I noticed that I always get a slightly higher validation accuracy, independent of the initial split. So I researched when this could be…
Lis Louise · 139
8 votes · 2 answers

Effect of batch size and number of GPUs on model accuracy

I have a data set that was split using a fixed random seed, and I am going to use 80% of the data for training and the rest for validation. Here are my GPU and batch size configurations: use a batch size of 64 with one GTX 1080 Ti; use a batch size of 128 with…
bit_scientist · 241
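One relevant interaction, stated as a rule of thumb rather than a guarantee: with synchronous data parallelism, the effective batch size is the per-GPU batch size times the number of GPUs, and the linear-scaling heuristic adjusts the learning rate in the same proportion. A plain-Python sketch (the base values are illustrative):

```python
def effective_batch(per_gpu_batch, n_gpus):
    # Synchronous data parallelism averages gradients across replicas,
    # so each step effectively sees per_gpu_batch * n_gpus samples.
    return per_gpu_batch * n_gpus

def scaled_lr(base_lr, base_batch, new_batch):
    # Linear scaling heuristic: grow the learning rate in proportion
    # to the growth of the effective batch size.
    return base_lr * new_batch / base_batch

batch = effective_batch(64, 2)            # 128
print(batch, scaled_lr(0.01, 64, batch))  # → 128 0.02
```

This is why changing batch size or GPU count while holding the learning rate fixed can change final accuracy: the optimization trajectory changes, not just the throughput.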
7 votes · 1 answer

Deep Q-Learning poor convergence on Stochastic Environment

I'm trying to implement a deep Q-network in Keras/TF that learns to play Minesweeper (a stochastic environment). I have noticed that the agent learns to play the game pretty well with both small and large board sizes. However, it only…
7 votes · 1 answer

Why does 'loss' change depending on the number of epochs chosen?

I am using Keras to train different NNs. I would like to know why, if I increase the number of epochs by 1, the results up to the new epoch are not the same. I am using shuffle=False and np.random.seed(2017), and I have checked that if I repeat with the same…
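Part of the answer is that a single np.random.seed call does not pin down every source of randomness Keras touches (Python's random module, the TF backend RNG, weight initialization, dropout). A sketch of fuller seeding, assuming TensorFlow ≥ 2.7 (GPU op nondeterminism can still remain):

```python
import numpy as np
import tensorflow as tf

def build_dense():
    # A layer whose kernel is initialized by the (seeded) RNG
    layer = tf.keras.layers.Dense(4)
    layer.build((None, 3))
    return layer

# Seeds Python's random, NumPy, and the TF/Keras backend in one call
tf.keras.utils.set_random_seed(0)
w1 = build_dense().get_weights()[0]

tf.keras.utils.set_random_seed(0)
w2 = build_dense().get_weights()[0]

print(np.allclose(w1, w2))  # → True: same seed, same initial weights
```

With all seeds fixed and shuffle=False, the first N epochs of an N-epoch run and an (N+1)-epoch run should coincide; any remaining drift usually points at nondeterministic GPU kernels or stateful callbacks.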
6 votes · 2 answers

Two data classes for a convolutional neural network: can one have a LOT more images for training than the other?

I have two classes in the training set: one that has images with a feature and the other of images without that feature. Can there be a LOT more images with "no feature" so I can fit in all possible false positives?
Vasya T · 69
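Heavily unbalanced classes are common; one standard mitigation in Keras is to weight the loss per class rather than discarding data. A sketch of the usual inverse-frequency ("balanced") weighting formula, with hypothetical counts; the resulting dict can be passed to model.fit(..., class_weight=...):

```python
import numpy as np

counts = np.array([9000, 1000])  # e.g. "no feature" vs "feature" images
n_classes = len(counts)

# Inverse-frequency weights: the rare class gets a larger weight,
# so each class contributes equally to the loss on average.
class_weight = {i: float(counts.sum() / (n_classes * c))
                for i, c in enumerate(counts)}
print(class_weight)  # → {0: 0.5555..., 1: 5.0}
```

So yes, the "no feature" class can be much larger; weighting (or per-class metrics such as precision/recall) keeps the model from simply predicting the majority class.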
6 votes · 1 answer

Is it possible to use deep learning to give approximate solutions to NP-hard graph theory problems?

Is it possible to use deep learning to give approximate solutions to NP-hard graph theory problems? Take, for example, the travelling salesman problem (or the dominating set problem). Let's say I have a bunch of smaller examples, where I…
6 votes · 1 answer

How to graphically represent an RNN architecture implemented in Keras?

I'm trying to create a simple blogpost on RNNs, that should give a better insight into how they work in Keras. Let's say: model = keras.models.Sequential() model.add(keras.layers.SimpleRNN(5, return_sequences=True, input_shape=[None,…
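For a rendered diagram, keras.utils.plot_model is the built-in option (it needs the pydot and graphviz packages installed), with model.summary() as a text fallback. A sketch that completes a model along the lines of the truncated snippet above (the remaining layer sizes are guesses, since the excerpt is cut off):

```python
from tensorflow import keras

model = keras.models.Sequential()
model.add(keras.Input(shape=(None, 1)))  # variable-length sequences
model.add(keras.layers.SimpleRNN(5, return_sequences=True))
model.add(keras.layers.SimpleRNN(5))
model.add(keras.layers.Dense(1))

model.summary()  # text view of layers and output shapes
try:
    # Graphical view; requires pydot + graphviz to be installed
    keras.utils.plot_model(model, "rnn.png", show_shapes=True)
except ImportError:
    pass
```

show_shapes=True annotates each box with input/output shapes, which is usually the part a blog-post reader needs to see.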
6 votes · 3 answers

Why are traditional ML models still used over deep neural networks?

I'm still taking my first steps in the data science field. I have played with some DL frameworks, like TensorFlow (pure) and Keras (on top), and I know a little bit about some "classic machine learning" algorithms, like decision trees, k-nearest neighbors,…
Douglas Ferreira · 845
5 votes · 1 answer

How to constrain the output value of a neural network?

I am training a deep neural network. There is a constraint on the output value of the neural network (e.g. the output has to be between 0 and 180). I think some possible solutions are using sigmoid or tanh activations at the end of the network. Are there…
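One common pattern for the 0-to-180 case mentioned above: let the last layer produce a sigmoid and rescale it into the target range. A NumPy sketch of the mapping (in Keras this could be, e.g., a Dense(1, activation='sigmoid') followed by a layer multiplying by 180; that wiring is one option among several, not the only answer):

```python
import numpy as np

def bounded_output(x, low=0.0, high=180.0):
    # The sigmoid squashes any real number into (0, 1);
    # rescaling maps it into the open interval (low, high).
    return low + (high - low) / (1.0 + np.exp(-x))

print(bounded_output(0.0))    # → 90.0: midpoint of the range
print(bounded_output(50.0))   # saturates just below the upper bound
print(bounded_output(-50.0))  # saturates just above the lower bound
```

A caveat worth knowing: outputs near the bounds sit on the saturated tail of the sigmoid, where gradients are tiny, so targets clustered at 0 or 180 can train slowly.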
5 votes · 0 answers

What are the ways to calculate the error rate of a deep convolutional neural network when the network produces different results using the same data?

I am new to the object recognition community. Here I am asking about the broadly accepted ways to calculate the error rate of a deep CNN when the network produces different results using the same data. 1. Problem introduction Recently I was trying…
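A broadly accepted convention for nondeterministic training is to repeat the whole train-and-evaluate cycle several times with different seeds and report the mean and sample standard deviation of the metric. A sketch with made-up accuracies from five hypothetical runs:

```python
import numpy as np

# Accuracies from five independent training runs (hypothetical values)
accs = np.array([0.912, 0.905, 0.918, 0.909, 0.913])

mean = accs.mean()
std = accs.std(ddof=1)  # ddof=1: sample standard deviation
print(f"accuracy: {mean:.3f} +/- {std:.3f} over {len(accs)} runs")
```

Reporting the spread alongside the mean is what lets a reader judge whether a difference between two models is larger than the run-to-run noise.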
5 votes · 1 answer

Over- and underestimation of the lowest and highest values in an LSTM network

I'm training an LSTM network with multiple inputs and several LSTM layers in order to set up a time series gap filling procedure. The LSTM is trained bidirectionally with "tanh" activation on the outputs of the LSTM, and one Dense layer with…
5 votes · 1 answer

How does backpropagation work on a custom loss function whose components have magnitudes of different orders?

I want to use a custom loss function that is a weighted combination of L1 and DSSIM losses. The DSSIM loss is limited between 0 and 0.5, whereas the L1 loss can be orders of magnitude greater, and is so in my case. How does backpropagation work in…
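A hedged sketch of such a combination in TensorFlow/Keras (the weight alpha and max_val=1.0 are assumptions; tf.image.ssim expects 4-D image batches at least as large as its 11x11 default filter). As for the backpropagation question: gradients flow through each term independently, so the weights scale each term's gradient directly, which is exactly why rebalancing them changes which term dominates the update:

```python
import tensorflow as tf

def make_l1_dssim_loss(alpha=0.8):
    """Weighted sum of DSSIM and L1; alpha trades one term off
    against the other (an illustrative choice, not a recommendation)."""
    def loss(y_true, y_pred):
        l1 = tf.reduce_mean(tf.abs(y_true - y_pred))
        # DSSIM = (1 - SSIM) / 2, small when images are similar
        dssim = tf.reduce_mean(
            (1.0 - tf.image.ssim(y_true, y_pred, max_val=1.0)) / 2.0)
        return alpha * dssim + (1.0 - alpha) * l1
    return loss
```

Usage would be model.compile(optimizer="adam", loss=make_l1_dssim_loss()); choosing alpha so that the two weighted terms have comparable magnitudes on typical batches is the practical way to keep one from drowning out the other.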