Most Popular
1500 questions
8 votes, 2 answers
Are there local search algorithms that make use of memory to give better solutions?
I have been studying local search algorithms such as greedy hill-climbing, stochastic hill-climbing, simulated annealing, etc. I have noticed that most of these methods use very little memory compared to systematic search techniques.
Are…
quantumcoder
- 81
8 votes, 2 answers
How to calculate the number of parameters of a convolutional layer?
I was recently asked at an interview to calculate the number of parameters for a convolutional layer. I am deeply ashamed to admit I didn't know how to do that, even though I've been working with CNNs for years now.
Given a convolutional layer…
Ælex
- 215
8 votes, 2 answers
How big an artificial neural network can we run now if our total energy budget for computation is equivalent to the human brain's energy budget?
How big an artificial neural network can we run now (either with a full train-backprop cycle or just evaluating network outputs) if our total energy budget for computation is equivalent to the human brain's energy budget (12.6 watts)?
Let's assume one cycle…
liori
- 513
8 votes, 6 answers
What event would confirm that we have implemented an AGI system?
I was listening to a podcast on the topic of AGI and a guest made an argument that if strong music generation were to happen, it would be a sign of "true" intelligence in machines because of how much creative capability creating music requires (even…
Landon G
- 500
8 votes, 2 answers
Should I prefer the model with the lowest validation loss or the highest validation accuracy to deploy?
I trained a ResNet20 on CIFAR-10 and obtained the following learning curves.
From the figures, I see that at epoch 52, my validation loss is 0.323 (the lowest), and my validation accuracy is 89.7%.
On the other hand, at the end of the training (epoch…
SpiderRico
- 1,040
8 votes, 1 answer
How does the generator in a GAN work?
After reading a lot of articles (for instance, this one - https://developers.google.com/machine-learning/gan/generator), I've been wondering: how does the generator in a GAN work?
What is the input to the generator? What is the meaning behind "input…
Shir K
- 183
8 votes, 2 answers
Effect of batch size and number of GPUs on model accuracy
I have a data set that was split using a fixed random seed and I am going to use 80% of the data for training and the rest for validation.
Here are my GPU and batch size configurations:
use a batch size of 64 with one GTX 1080Ti
use a batch size of 128 with…
bit_scientist
- 241
8 votes, 1 answer
Do all neurons in a layer have the same activation function?
I'm new to machine learning (so excuse my nomenclature), and not being a Python developer, I decided to jump in at the deep (no pun intended) end by writing my own framework in C++.
In my current design, I have given each neuron/cell the possibility to…
lfgtm
- 230
8 votes, 1 answer
Are PAC learning and VC dimension relevant to machine learning in practice?
Are PAC learning and VC dimension relevant to machine learning in practice? If yes, what is their practical value?
To my understanding, there are two strikes against these theories. The first is that the results are all conditioned on knowing the…
FourierFlux
- 847
8 votes, 2 answers
Is word embedding a form of feature extraction?
Feature extraction is the translation of raw data into the inputs that a particular machine learning algorithm requires. These are features derived from the raw data that are actually relevant to tackling the underlying problem. On…
HiDDeN
- 83
8 votes, 2 answers
Can a deep neural network be trained to classify an integer N1 as being divisible by another integer N2?
So I’ve been working on my own little dynamic architecture for a deep neural network (any number of hidden layers with any number of nodes in every layer) and got it solving the XOR problem efficiently. I moved on to trying to see if I could train…
bigphil
- 83
8 votes, 1 answer
Selecting the right technique to predict disease from symptoms
I'm trying to come up with the right algorithm for a system in which the user enters a few symptoms and the system has to predict or determine the likelihood that a few selected symptoms are associated with those existing in the system. Then, after…
quintumnia
- 1,173
8 votes, 1 answer
Does adding a constant to all rewards change the set of optimal policies in episodic tasks?
I'm taking a Coursera course on reinforcement learning. There was a question there that wasn't addressed in the learning material: Does adding a constant to all rewards change the set of optimal policies in episodic tasks?
The answer is Yes - Adding…
Maverick Meerkat
- 422
8 votes, 2 answers
What are the current big challenges in natural language processing and understanding?
I'm writing a paper for a class on the big problems that are still prevalent in AI, specifically in the area of natural language processing and understanding. From what I understand, the areas:
Text classification
Entity recognition…
Landon G
- 500
8 votes, 0 answers
Normalizing Normal Distributions in Thompson Sampling for online Reinforcement Learning
In my implementation of Thompson Sampling (TS) for online Reinforcement Learning, my distribution for selecting $a$ is $\mathcal{N}(Q(s, a), \frac{1}{C(s,a)+1})$, where $C(s,a)$ is the number of times $a$ has been picked in $s$.
However, I found…
Kevin
- 81