Most Popular
1500 questions
5 votes, 1 answer
Berkeley AI Course Question on Nearly Zero Sum Games
I am drawing this question from Berkeley's AI course (also not sure if it is the correct place to ask, so I apologize ahead of time)
https://inst.eecs.berkeley.edu/~cs188/pacman/course_schedule.html
Currently, I am working on section 3's…
user2429446
5 votes, 1 answer
What is the intuition behind variational inference for Bayesian neural networks?
I'm trying to understand the concept of variational inference for BNNs. My source is this work. The aim is to minimize the divergence between the approximate distribution and the true posterior
$$\text{KL}(q_{\theta}(w) \,\|\, p(w|D)) = \int q_{\theta}(w) \…
f_3464gh
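The divergence in the excerpt above is usually made tractable by rearranging it into the evidence lower bound (ELBO); a standard identity from the variational inference literature (not specific to the linked work):

```latex
\text{KL}\big(q_{\theta}(w)\,\|\,p(w \mid D)\big)
  = \log p(D)
  - \underbrace{\Big(\mathbb{E}_{q_{\theta}(w)}\big[\log p(D \mid w)\big]
  - \text{KL}\big(q_{\theta}(w)\,\|\,p(w)\big)\Big)}_{\text{ELBO}}
```

Since $\log p(D)$ does not depend on $\theta$, minimizing the KL to the posterior is equivalent to maximizing the ELBO, which only needs the likelihood and the prior.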
5 votes, 2 answers
What is the difference between a language model and a word embedding?
I am self-studying applications of deep learning to NLP and machine translation.
I am confused about the concepts of "Language Model", "Word Embedding", "BLEU Score".
It appears to me that a language model is a way to predict the next word given…
Exploring
5 votes, 1 answer
Is the 'direction' considered, when determining the branching factor in bidirectional search?
If I am correct, the branching factor is the maximum number of successors of any node.
When I am applying bidirectional search to a transition graph like this one below
If 11 is the goal state and I start going backwards, is 10 considered as a…
Artery
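For the bidirectional-search question above, the backward frontier expands predecessors, so the relevant quantity for the backward direction is the maximum in-degree rather than the maximum out-degree. A minimal sketch on an invented adjacency list (not the graph from the question):

```python
# Directed transition graph as an adjacency list (invented example).
graph = {
    1: [2, 3],
    2: [4],
    3: [4, 5],
    4: [5],
    5: [],
}

def forward_branching_factor(g):
    """Max number of successors of any node (forward search direction)."""
    return max(len(succs) for succs in g.values())

def backward_branching_factor(g):
    """Max number of predecessors of any node (backward search direction)."""
    preds = {node: 0 for node in g}
    for succs in g.values():
        for s in succs:
            preds[s] += 1
    return max(preds.values())

print(forward_branching_factor(graph))   # 2
print(backward_branching_factor(graph))  # 2 (e.g. node 4 is reached from both 2 and 3)
```

In this example the two factors happen to coincide, but on an asymmetric graph they generally differ, which is why the direction matters.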
5 votes, 1 answer
Why did the development of neural networks stop between the 50s and the 80s?
In a video lecture on the development of neural networks and the history of deep learning (you can start from minute 13), the lecturer (Yann LeCun) said that the development of neural networks stopped until the 80s because people were using the…
Daviiid
5 votes, 1 answer
How would I compute the optimal state-action value for a certain state and action?
I am currently trying to learn reinforcement learning and I started with the basic gridworld application. I tried Q-learning with the following parameters:
Learning rate = 0.1
Discount factor = 0.95
Exploration rate = 0.1
Default reward = 0
The…
Rim Sleimi
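The parameters listed in the excerpt above plug straight into the tabular Q-learning update rule. A minimal sketch of that rule plus epsilon-greedy action selection, assuming generic hashable states and actions (the gridworld environment itself is not shown):

```python
import random

ALPHA = 0.1    # learning rate, as in the question
GAMMA = 0.95   # discount factor
EPSILON = 0.1  # exploration rate

Q = {}  # (state, action) -> value; missing entries default to 0

def q(state, action):
    return Q.get((state, action), 0.0)

def choose_action(state, actions):
    """Epsilon-greedy: random action with probability EPSILON, else greedy."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: q(state, a))

def update(state, action, reward, next_state, actions):
    """One Q-learning step: move Q(s,a) toward r + GAMMA * max_a' Q(s',a')."""
    target = reward + GAMMA * max(q(next_state, a) for a in actions)
    Q[(state, action)] = q(state, action) + ALPHA * (target - q(state, action))

update("s0", "right", 1.0, "s1", ["left", "right"])
print(Q[("s0", "right")])  # 0.1 on a fresh table: 0 + 0.1 * (1.0 + 0.95*0 - 0)
```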
5 votes, 1 answer
Why does Q-learning converge under 100% exploration rate?
I am working on this assignment where I made the agent learn state-action values (Q-values) with Q-learning and 100% exploration rate. The environment is the classic gridworld as shown in the following picture.
Here are the values of my…
Rim Sleimi
5 votes, 1 answer
Can CNNs be made robust to tricks where small changes cause misclassification?
A while ago I read that you can make subtle changes to an image that will ensure a good CNN will horribly misclassify the image. I believe the changes must exploit details of the CNN that will be used for classification. So we can trick a good CNN…
Ted Ersek
5 votes, 2 answers
Is it practical to train AlphaZero or MuZero (for indie games) on a personal computer?
Is it practical/affordable to train an AlphaZero/MuZero engine using a residential gaming PC, or would it take thousands of years of training for the AI to learn enough to challenge humans?
I'm having trouble wrapping my head around how much…
Luke W
5 votes, 1 answer
What happens if 2 genes have the same connection but a different innovation number?
I have read the Evolving Neural Networks through Augmenting Topologies (NEAT) paper, but some doubts are still bugging me, so I have two questions.
When do mutations occur? Between which nodes?
When mating, what happens if 2 genes have the same…
Miemels
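In the NEAT paper, crossover aligns genes purely by innovation number, so two genes describing the same connection but carrying different innovation numbers (which can happen when the same structure arises independently in different generations) are treated as disjoint/excess genes, not as a matching pair. A rough sketch of that alignment (the gene representation here is an assumption, not the paper's data structure):

```python
def align_genes(parent_a, parent_b):
    """Split genes into matching pairs and non-matching (disjoint/excess) genes,
    keyed purely on innovation number, as in NEAT crossover."""
    a = {g["innov"]: g for g in parent_a}
    b = {g["innov"]: g for g in parent_b}
    matching = [(a[i], b[i]) for i in a if i in b]
    non_matching = [a[i] for i in a if i not in b] + [b[i] for i in b if i not in a]
    return matching, non_matching

# Same connection (1 -> 4) but different innovation numbers: NOT a match.
p1 = [{"innov": 3, "conn": (1, 4)}]
p2 = [{"innov": 7, "conn": (1, 4)}]
matching, non_matching = align_genes(p1, p2)
print(len(matching), len(non_matching))  # 0 2
```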
5 votes, 1 answer
Is (log-)standard deviation learned in TRPO and PPO or fixed instead?
After having read Williams (1992), where it was suggested that actually both the mean and standard deviation can be learned while training a REINFORCE algorithm on generating continuous output values, I assumed that this would be common practice…
Daniel B.
5 votes, 1 answer
When do mutations in NEAT occur?
I read through the Evolving Neural Networks through Augmenting Topologies (NEAT) paper. I understand the algorithm now, but one thing is still unclear to me.
When does the mutation occur and how does it take place? How is it chosen whether to add a…
Miemels
5 votes, 3 answers
In Q-learning, wouldn't it be better to simply iterate through all possible states?
In Q-learning, all resources I've found seem to say that the algorithm to update the Q-table should start at some initial state, and pick actions (which are sometimes random) to explore the state space.
However, wouldn't it be better/faster/more…
Kricket
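The idea in the excerpt above, updating Q for every state-action pair instead of following trajectories, amounts to a synchronous sweep as in value iteration, which is only possible when the transition model is known. A minimal sketch on a tiny invented deterministic MDP:

```python
GAMMA = 0.95

# Hypothetical deterministic MDP: (state, action) -> (next_state, reward).
model = {
    ("s0", "go"):   ("s1", 0.0),
    ("s0", "stay"): ("s0", 0.0),
    ("s1", "go"):   ("s1", 1.0),
    ("s1", "stay"): ("s0", 0.0),
}
actions = ["go", "stay"]

Q = {sa: 0.0 for sa in model}
for _ in range(500):  # full synchronous sweeps over ALL (state, action) pairs
    Q = {
        (s, a): reward + GAMMA * max(Q[(nxt, a2)] for a2 in actions)
        for (s, a), (nxt, reward) in model.items()
    }

print(round(Q[("s1", "go")], 2))  # 20.0, the fixed point of 1 + 0.95 * v
```

The trade-off the question is asking about is exactly this: the sweep needs `model` up front, whereas Q-learning estimates the same values from sampled transitions alone.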
5 votes, 1 answer
Using a ConvNet to predict a bitmap
I want to build a classifier which takes an aerial image and outputs a bitmap. The bitmap is supposed to be 1 at every pixel where the aerial image has water. For this process I want to use a ConvNet but I am unsure about the output layer. I…
treigerm
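For the bitmap-output question above, the usual framing is per-pixel binary classification: the final conv layer emits one logit per pixel, squashed with a sigmoid and trained with binary cross-entropy. A framework-free NumPy sketch of just that output stage (logit values and shapes are invented):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    p = np.clip(probs, eps, 1.0 - eps)
    return -np.mean(targets * np.log(p) + (1.0 - targets) * np.log(1.0 - p))

# Pretend these logits came out of the last conv layer (a tiny H x W map).
logits = np.array([[ 4.0, -4.0],
                   [-4.0,  4.0]])
target_bitmap = np.array([[1.0, 0.0],
                          [0.0, 1.0]])  # 1 = water pixel

probs = sigmoid(logits)              # per-pixel water probability
bitmap = (probs > 0.5).astype(int)   # thresholded prediction
loss = bce_loss(probs, target_bitmap)
print(bitmap)  # matches target_bitmap here
```

The same per-pixel sigmoid + BCE head carries over directly to any conv framework; only the layers producing the logits change.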
5 votes, 1 answer
Are there any approaches to AGI that will definitely not work?
Is there empirical evidence that some approaches to achieving AGI will definitely not work? For the purposes of the question the system should at least be able to learn and solve novel problems.
Some possible approaches:
A Prolog program
A program…
persiflage