For questions about training networks, rules systems, or other AI system components.
Questions tagged [training]
509 questions
25
votes
3 answers
How do I choose the optimal batch size?
Batch size is a term used in machine learning and refers to the number of training examples utilised in one iteration. The batch size can be one of three options:
batch mode: where the batch size is equal to the total dataset thus making the…
Sebastian Nielsen
- 401
- 1
- 4
- 11
16
votes
5 answers
Why are the initial weights of neural networks randomly initialised?
This might sound silly to someone who has plenty of experience with neural networks but it bothers me...
Random initial weights might give you better results that would be somewhat closer to what a trained neural network should look like, but it…
Matas Vaitkevicius
- 271
- 5
- 12
15
votes
4 answers
Can someone help me understand this paragraph from Nvidia's progressive GAN paper?
In the paper Progressive Growing of GANs for Improved Quality, Stability, and Variation (ICLR, 2018) by Nvidia researchers, the authors write:
Furthermore, we observe that mode collapses traditionally plaguing GANs tend to happen very quickly, over…
Inkplay_
- 421
- 4
- 8
14
votes
3 answers
How to train a neural network for a round-based board game?
I'm wondering how to train a neural network for a round-based board game like tic-tac-toe, chess, Risk, or any other round-based game.
Getting the next move by inference seems to be pretty straightforward, by feeding the game state as input and…
soriak
- 249
- 1
- 2
- 3
13
votes
2 answers
Which layer in a CNN consumes more training time: convolution layers or fully connected layers?
In a convolutional neural network, which layer consumes more training time: convolution layers or fully connected layers?
We can take the AlexNet architecture to understand this. I want to see the time breakdown of the training process. I want a relative…
Ruchit Dalwadi
- 335
- 3
- 11
13
votes
2 answers
Can you train a neural network by simply giving it ratings each time it runs?
I am currently trying to train a bot for a game I am creating. It is a 2D game with a complex map made of various shapes. The bot and character shoot bullets that are capable of ricocheting. The neural network outputs a vector in which the bot will…
Beluker
- 133
- 1
- 5
13
votes
3 answers
Is it possible to train a neural network to estimate a vehicle's length?
I have a large dataset (over 100k samples) of vehicles with the ground truth of their lengths.
Is it possible to train a deep network to measure/estimate vehicle length?
I haven't seen any papers related to estimating object size using a deep neural…
Naji
- 139
- 1
- 1
- 3
13
votes
2 answers
How are generative adversarial networks trained?
I am reading about generative adversarial networks (GANs) and I have some doubts regarding them. So far, I understand that in a GAN there are two different types of neural networks: one is generative ($G$) and the other discriminative ($D$). The…
Eka
- 1,106
- 8
- 24
12
votes
1 answer
What are the best known gradient-free training methods for deep learning?
As far as I know, the current state-of-the-art methods for training deep learning networks are variants of gradient descent or stochastic gradient descent.
What are the best known gradient-free training methods for deep learning (mostly in visual tasks…
rkellerm
- 334
- 1
- 9
11
votes
7 answers
Why does training an SVM take so long? How can I speed it up?
I'm trying to create and test non-linear SVMs with various kernels (RBF, Sigmoid, Polynomial) in scikit-learn, to create a model which can classify anomalies and benign behaviors.
My dataset includes 692703 records and I use a 75/25%…
Panagiotis
- 211
- 1
- 2
- 3
11
votes
3 answers
What size of neural networks can be trained on current consumer-grade GPUs (1060, 1070, 1080)?
Is it possible to give a rule of thumb estimate about the size of neural networks that are trainable on common consumer-grade GPUs?
For example, the Emergence of Locomotion (Reinforcement) paper trains a network using tanh activation of the neurons.…
pascalwhoop
- 305
- 1
- 8
10
votes
2 answers
How can I encode angle data to train neural networks?
I am training a neural network where the target data is a vector of angles in radians (between $0$ and $2\pi$).
I am looking for study material on how to encode this data.
Can you supply me with a book or research paper that covers this topic…
user366312
- 341
- 1
- 13
9
votes
3 answers
Is a GPU always faster than a CPU for training neural networks?
Currently, I am working on a few projects that use feedforward neural networks for regression and classification of simple tabular data. I have noticed that training a neural network using TensorFlow-GPU is often slower than training the same…
GKozinski
- 1,290
- 11
- 22
9
votes
4 answers
What could an oscillating training loss curve represent?
I tried to create a simple model that receives an $80 \times 130$ pixel image. I only had 35 images and 10 test images. I trained this model for a binary classification task. The architecture of the model is described below.
conv2d_1 (Conv2D) …
Krishnakumar
- 91
- 1
- 1
- 2
8
votes
2 answers
Can LSTM neural networks be sped up by a GPU?
I am training LSTM neural networks with Keras on a small mobile GPU. The speed on the GPU is slower than on the CPU. I found some articles that say that it is hard to train LSTMs (and, in general, RNNs) on GPUs because the training cannot be…
Dieshe
- 289
- 1
- 2
- 6