Most Popular

5 votes • 1 answer

Berkeley AI Course Question on Nearly Zero Sum Games

I am drawing this question from Berkeley's AI course (I'm also not sure whether this is the correct place to ask, so I apologize in advance): https://inst.eecs.berkeley.edu/~cs188/pacman/course_schedule.html Currently, I am working on section 3's…
5 votes • 1 answer

What is the intuition behind variational inference for Bayesian neural networks?

I'm trying to understand the concept of Variational Inference for BNNs. My source is this work. The aim is to minimize the divergence between the approx. distribution and the true posterior $$\text{KL}(q_{\theta}(w)||p(w|D)) = \int q_{\theta}(w) \…
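As an aside, the identity behind that objective (reconstructed here from standard variational inference, not from the truncated excerpt) is:

```latex
\text{KL}\big(q_{\theta}(w)\,\|\,p(w\mid D)\big)
  = \int q_{\theta}(w)\,\log\frac{q_{\theta}(w)}{p(w\mid D)}\,dw
  = \log p(D)
    - \underbrace{\Big(\mathbb{E}_{q_{\theta}(w)}\big[\log p(D\mid w)\big]
    - \text{KL}\big(q_{\theta}(w)\,\|\,p(w)\big)\Big)}_{\text{ELBO}}
```

Since $\log p(D)$ does not depend on $\theta$, minimizing the KL to the posterior is equivalent to maximizing the ELBO, which is the quantity actually optimized in practice.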
5 votes • 2 answers

What is the difference between a language model and a word embedding?

I am self-studying applications of deep learning to NLP and machine translation. I am confused about the concepts of "Language Model", "Word Embedding", and "BLEU Score". It appears to me that a language model is a way to predict the next word given…
5 votes • 1 answer

Is the 'direction' considered when determining the branching factor in bidirectional search?

If I am correct, the branching factor is the maximum number of successors of any node. When I am applying bidirectional search to a transition graph like the one below: if 11 is the goal state and I start going backwards, is 10 considered as a…
Artery
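Since the question's transition graph isn't reproduced here, a rough sketch (with a made-up graph standing in for the missing picture) of what the backward direction changes: searching backwards follows edges in reverse, so the relevant branching factor is the maximum in-degree rather than the maximum out-degree.

```python
# Branching factor = max number of successors over nodes. For the backward
# half of bidirectional search, edges are followed in reverse, so we
# measure successors in the reversed graph. The graph is invented.
graph = {1: [2, 3], 2: [4], 3: [4, 5], 4: [5], 5: []}

def branching_factor(g):
    return max(len(succ) for succ in g.values())

def reverse(g):
    rg = {n: [] for n in g}
    for n, succs in g.items():
        for s in succs:
            rg[s].append(n)   # edge n -> s becomes s -> n
    return rg

print(branching_factor(graph))           # forward: 2
print(branching_factor(reverse(graph)))  # backward: 2
```

In this toy graph the two directions happen to agree, but on asymmetric graphs they can differ, which is exactly why the question's distinction matters.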
5 votes • 1 answer

Why did the development of neural networks stop between the 50s and the 80s?

In a video lecture on the development of neural networks and the history of deep learning (you can start from minute 13), the lecturer (Yann LeCun) said that the development of neural networks stopped until the 80s because people were using the…
5 votes • 1 answer

How would I compute the optimal state-action value for a certain state and action?

I am currently trying to learn reinforcement learning and I started with the basic gridworld application. I tried Q-learning with the following parameters: learning rate = 0.1, discount factor = 0.95, exploration rate = 0.1, default reward = 0. The…
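A minimal sketch of the tabular Q-learning update with the question's parameters, run on a made-up 1-D corridor rather than the asker's gridworld (the goal reward of 1 is also an assumption):

```python
import random

random.seed(0)
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1   # values from the question
N_STATES = 5                             # toy corridor; goal at the right end
ACTIONS = [-1, +1]                       # left, right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    s2 = min(max(s + a, 0), N_STATES - 1)
    r = 1.0 if s2 == N_STATES - 1 else 0.0   # default reward 0, goal reward 1 (assumed)
    return s2, r, s2 == N_STATES - 1

for episode in range(500):
    s, done = 0, False
    while not done:
        if random.random() < EPSILON:                     # explore
            a = random.choice(ACTIONS)
        else:                                             # exploit
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])  # Q-learning update
        s = s2

# With gamma = 0.95, the converged Q(s, right) approaches 0.95 ** (steps_to_goal - 1)
print(round(Q[(3, 1)], 2))
```

The one-line update `Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])` is the whole algorithm; everything else is the toy environment.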
5 votes • 1 answer

Why does Q-learning converge under 100% exploration rate?

I am working on this assignment where I made the agent learn state-action values (Q-values) with Q-learning and 100% exploration rate. The environment is the classic gridworld as shown in the following picture. Here are the values of my…
5 votes • 1 answer

Can CNNs be made robust to tricks where small changes cause misclassification?

A while ago I read that you can make subtle changes to an image that will ensure a good CNN will horribly misclassify the image. I believe the changes must exploit details of the CNN that will be used for classification. So we can trick a good CNN…
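The "subtle changes" referred to are adversarial perturbations; the classic recipe is the fast gradient sign method (FGSM). A toy sketch on a hand-built logistic classifier standing in for a CNN (the weights and input are invented; the sign-of-gradient step is the real technique):

```python
import math

# FGSM: perturb the input in the direction that increases the loss,
# i.e. x_adv = x + eps * sign(dLoss/dx).
w = [2.0, -3.0, 1.5]          # fixed "trained" weights (made up)
b = 0.1

def predict(x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))   # P(class = 1)

x = [0.4, 0.1, 0.3]           # clean input, classified as class 1
y = 1.0
p = predict(x)

# For logistic cross-entropy, the gradient of the loss w.r.t. x is (p - y) * w.
grad_x = [(p - y) * wi for wi in w]

eps = 0.25                    # perturbation budget
sign = lambda v: (v > 0) - (v < 0)
x_adv = [xi + eps * sign(g) for xi, g in zip(x, grad_x)]

print(predict(x) > 0.5, predict(x_adv) > 0.5)   # → True False
```

A bounded per-pixel change flips the prediction, which is the phenomenon the question describes; defenses such as adversarial training essentially fold these perturbed examples back into the training set.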
5 votes • 2 answers

Is it practical to train AlphaZero or MuZero (for indie games) on a personal computer?

Is it practical/affordable to train an AlphaZero/MuZero engine using a residential gaming PC, or would it take thousands of years of training for the AI to learn enough to challenge humans? I'm having trouble wrapping my head around how much…
Luke W
5 votes • 1 answer

What happens if 2 genes have the same connection but a different innovation number?

I have read the Evolving Neural Networks through Augmenting Topologies (NEAT) paper, but some doubts are still bugging me, so I have two questions. When do mutations occur? Between which nodes? When mating, what happens if 2 genes have the same…
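For context, my reading of the NEAT paper (an interpretation, not official code) is that a global innovation counter is combined with a within-generation lookup, so the same new connection appearing twice in one generation receives the same innovation number rather than two different ones:

```python
# Sketch of NEAT's innovation bookkeeping: a global counter plus a
# per-generation cache keyed by (in_node, out_node). The cache is what
# prevents two genes with the same connection from getting different
# innovation numbers within a generation.
innovation_counter = 0
seen_this_generation = {}   # (in_node, out_node) -> innovation number

def innovation_for(in_node, out_node):
    global innovation_counter
    key = (in_node, out_node)
    if key not in seen_this_generation:
        innovation_counter += 1
        seen_this_generation[key] = innovation_counter
    return seen_this_generation[key]

a = innovation_for(1, 4)   # one genome adds connection 1 -> 4
b = innovation_for(1, 4)   # another genome adds the same connection
c = innovation_for(2, 4)   # a different connection
print(a == b, c)           # → True 2
```

Across generations the cache is cleared, so the same connection can legitimately reappear under a new number; during crossover such genes simply fail to match and are treated as disjoint/excess.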
5 votes • 1 answer

Is (log-)standard deviation learned in TRPO and PPO or fixed instead?

After having read Williams (1992), where it was suggested that actually both the mean and standard deviation can be learned while training a REINFORCE algorithm on generating continuous output values, I assumed that this would be common practice…
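For context, in common PPO/TRPO implementations for continuous actions the log-standard deviation is typically a learned parameter (often state-independent). A sketch of the score-function gradient that makes it learnable, with a finite-difference check (the numbers are arbitrary):

```python
import math

# Gaussian policy log-density and its analytic gradient w.r.t. log_std.
# d/d(log_std) of log N(a | mu, std) = (a - mu)^2 / std^2 - 1,
# which is what a REINFORCE-style update uses to adapt the std.
def log_prob(a, mu, log_std):
    std = math.exp(log_std)
    return -log_std - 0.5 * math.log(2 * math.pi) - (a - mu) ** 2 / (2 * std ** 2)

def grad_log_std(a, mu, log_std):
    std = math.exp(log_std)
    return (a - mu) ** 2 / std ** 2 - 1.0

# Finite-difference check of the analytic gradient (arbitrary test point).
a, mu, ls, h = 0.7, 0.2, -0.5, 1e-6
numeric = (log_prob(a, mu, ls + h) - log_prob(a, mu, ls - h)) / (2 * h)
print(abs(numeric - grad_log_std(a, mu, ls)) < 1e-6)   # → True
```

The gradient is positive when actions land more than one std from the mean and negative otherwise, so the learned std widens or narrows to match the return-weighted spread of sampled actions.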
5 votes • 1 answer

When do mutations in NEAT occur?

I read through the Evolving Neural Networks through Augmenting Topologies (NEAT) paper. I understand the algorithm now, but one thing is still unclear to me. When does the mutation occur and how does it take place? How is it chosen whether to add a…
5 votes • 3 answers

In Q-learning, wouldn't it be better to simply iterate through all possible states?

In Q-learning, all resources I've found seem to say that the algorithm to update the Q-table should start at some initial state, and pick actions (which are sometimes random) to explore the state space. However, wouldn't it be better/faster/more…
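The "iterate through all possible states" idea is essentially value iteration, which is viable whenever the transition model is known and the state space is small enough to enumerate; sampled Q-learning exists precisely for when it isn't. A sketch on a made-up 1-D chain with the goal at the right end:

```python
# Value iteration: full sweeps over every (state, action) pair using the
# known model, instead of sampled transitions. Toy deterministic chain.
GAMMA = 0.95
N = 5
ACTIONS = [-1, +1]

def step(s, a):                       # known transition model
    s2 = min(max(s + a, 0), N - 1)
    r = 1.0 if s2 == N - 1 else 0.0
    return s2, r

Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
for sweep in range(100):              # repeat sweeps until values settle
    for s in range(N - 1):            # state N-1 is terminal
        for a in ACTIONS:
            s2, r = step(s, a)
            future = 0.0 if s2 == N - 1 else max(Q[(s2, b)] for b in ACTIONS)
            Q[(s, a)] = r + GAMMA * future

print(round(Q[(3, 1)], 4), round(Q[(0, 1)], 4))   # → 1.0 0.8574
```

Each sweep applies the Bellman optimality backup exactly once per state-action pair, so convergence is fast here; the catch, as the answers to such questions usually note, is that real problems rarely expose `step` as a queryable model, and the state space is often far too large to sweep.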
5 votes • 1 answer

Use ConvNet to predict bitmap

I want to build a classifier which takes an aerial image and outputs a bitmap. The bitmap is supposed to be 1 at every pixel where the aerial image has water. For this process I want to use a ConvNet but I am unsure about the output layer. I…
5 votes • 1 answer

Are there any approaches to AGI that will definitely not work?

Is there empirical evidence that some approaches to achieving AGI will definitely not work? For the purposes of the question the system should at least be able to learn and solve novel problems. Some possible approaches: A Prolog program A program…
persiflage