Highest Voted Questions - Artificial Intelligence Stack Exchange

6

votes

1 answer

Clarifying representation of Neural Network input for Chess Alpha Zero

In the Alpha Zero paper (https://arxiv.org/pdf/1712.01815.pdf) page 13, the input for the NN is described. At the beginning of the page, the authors state that: "The input to the Neural Network is an N x N x (MT + L) image stack [...]" From this, I…

reinforcement-learning deep-rl alphazero chess

asked Feb 17 '21 at 00:05

Andrew

161
6

6

votes

2 answers

What exactly are the differences between semantic and lexical-semantic networks?

terminology definitions semantics

asked Jan 10 '17 at 13:15

idontknowwhoiamgodhelpme

161
2

6

votes

1 answer

Can the quality of randomness in neural network initialization affect model fitting?

This is a topic I have been arguing about with my colleagues for some time now; maybe you could also voice your opinion about it. Artificial neural networks use random weight initialization within a certain value range. These random parameters are…

neural-networks weights-initialization randomness

asked Feb 09 '21 at 10:43

Aki Koivu

61
3

6

votes

1 answer

Why do feedforward neural networks require the inputs to be of a fixed size, while RNNs can process variable-size inputs?

Why does a vanilla feedforward neural network only accept a fixed input size, while RNNs are capable of taking a series of inputs with no predetermined limit on the size? Can anyone elaborate on this with an example?

neural-networks recurrent-neural-networks feedforward-neural-networks multilayer-perceptrons

asked Feb 04 '21 at 16:26

Daniel

63
3
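
A minimal sketch of the contrast raised in the question above, not taken from the thread itself. It assumes a single dense layer and a scalar recurrent cell; the names `dense`, `rnn`, `w_in`, and `w_rec` are hypothetical. The dense layer's weight matrix fixes the input length, while the recurrent cell reuses the same two weights at every step, so the sequence can have any length.

```python
import math

def dense(x, W):
    """One dense layer. W has one row per input feature, so len(x) is fixed by W."""
    if len(x) != len(W):
        raise ValueError(f"expected input of size {len(W)}, got {len(x)}")
    n_out = len(W[0])
    return [math.tanh(sum(x[i] * W[i][j] for i in range(len(x))))
            for j in range(n_out)]

def rnn(xs, w_in, w_rec):
    """Scalar recurrent cell. The same weights are applied at every time step."""
    h = 0.0
    for x in xs:
        h = math.tanh(w_in * x + w_rec * h)
    return h

# The weight matrix has 3 rows, so this layer accepts exactly 3 inputs.
W = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]
out_fixed = dense([1.0, 2.0, 3.0], W)

# The recurrent cell handles sequences of length 2 and 5 with no change in parameters.
h_short = rnn([1.0, 2.0], w_in=0.5, w_rec=0.8)
h_long = rnn([1.0, 2.0, 3.0, 4.0, 5.0], w_in=0.5, w_rec=0.8)
```

Calling `dense([1.0, 2.0], W)` raises a `ValueError`, because the number of parameters depends on the input size; the recurrent cell has no such dependence.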

6

votes

1 answer

How to improve the reward signal when the rewards are sparse?

In cases where the reward is delayed, this can negatively impact a model's ability to do proper credit assignment. In the case of a sparse reward, are there ways in which this can be negated? In a chess example, there are certain moves that you can…

reinforcement-learning reward-functions sparse-rewards delayed-rewards potential-reward-shaping

asked Feb 03 '21 at 18:17

tryingtolearn

395
1
2
10

6

votes

1 answer

What are the state space and the state transition function in AI?

I'm studying for my AI final exam, and I'm stuck on the state space representation. I understand initial and goal states, but what I don't understand is the state space and state transition function. Can someone explain what they are with…

terminology definitions search state-spaces transition-model

asked Jan 06 '17 at 15:24

İsmail Uysal

63
1
4
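
As a rough illustration of the two terms in the question above, here is a toy search problem, assumed purely for the example: an agent on a line of four cells that can move left or right. The names `STATE_SPACE`, `ACTIONS`, and `transition` are hypothetical. The state space is the set of all configurations the problem can be in, and the transition function maps a state and an action to the next state.

```python
# Toy problem (an assumption for illustration): an agent on cells 0..3.
STATE_SPACE = {0, 1, 2, 3}       # every state the problem can be in
ACTIONS = ("left", "right")
INITIAL_STATE = 0
GOAL_STATE = 3

def transition(state, action):
    """State transition function: (state, action) -> next state."""
    if action == "left":
        return max(state - 1, 0)     # bumping into the left wall keeps the agent in place
    if action == "right":
        return min(state + 1, 3)     # likewise for the right wall
    raise ValueError(f"unknown action: {action}")

# Applying a sequence of actions traces a path through the state space.
state = INITIAL_STATE
for action in ("right", "right", "right"):
    state = transition(state, action)
reached_goal = state == GOAL_STATE
```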

6

votes

1 answer

What are the advantages of RL with actor-critic methods over actor-only methods?

In general, what are the advantages of RL with actor-critic methods over actor-only (or policy-based) methods? I am not asking for a comparison with Q-learning-style methods, but with methods that learn the game using only the actor. I think it's…

reinforcement-learning comparison actor-critic-methods policy-based-methods continuous-tasks

asked Jan 12 '21 at 22:29

ground clown

111
3

6

votes

1 answer

How to express a fully connected neural network succinctly using linear algebra?

I'm currently reading the paper Federated Learning with Matched Averaging (2020), where the authors claim: A basic fully connected (FC) NN can be formulated as: $\hat{y} = \sigma(xW_1)W_2$ [...] Expanding the preceding expression $\hat{y} =…

neural-networks papers feedforward-neural-networks linear-algebra federated-learning

asked Jan 05 '21 at 13:11

user1360448

83
6
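
A small sketch of the expression $\hat{y} = \sigma(xW_1)W_2$ quoted in the question above. It assumes that $x$ is a row vector and that $\sigma$ is the logistic sigmoid applied elementwise; the paper may intend a different nonlinearity. The helper names and the weight values are hypothetical.

```python
import math

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def sigma(M):
    """Elementwise logistic sigmoid (an assumed choice of nonlinearity)."""
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in M]

x = [[1.0, 2.0]]                          # 1 x 2 input row vector
W1 = [[0.5, -1.0, 0.0],
      [0.25, 0.5, 1.0]]                   # 2 x 3 hidden-layer weights
W2 = [[1.0], [-1.0], [0.5]]               # 3 x 1 output-layer weights

hidden = sigma(matmul(x, W1))             # sigma(x W1), shape 1 x 3
y_hat = matmul(hidden, W2)                # sigma(x W1) W2, shape 1 x 1
```

Writing the input as a row vector is what lets the weights appear on the right of `x`; with column vectors the same network would be written with the transposed matrices on the left.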

6

votes

1 answer

Why do we need importance sampling?

I was studying the off-policy policy improvement method. Then I encountered importance sampling. I completely understood the mathematics behind the calculation, but I am wondering what a practical example of importance sampling would be. For instance,…

reinforcement-learning monte-carlo-methods off-policy-methods importance-sampling

asked Jan 04 '21 at 01:43

Alireza Hosseini

61
3
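
One way to make the question above concrete is a two-action sketch, under assumed numbers: samples are drawn from a behavior policy `b`, but the quantity of interest is the expected reward under a different target policy `pi`. All names and probabilities here are hypothetical. Reweighting each sample by `pi[a] / b[a]` corrects for the mismatch.

```python
import random

random.seed(0)

pi = {0: 0.9, 1: 0.1}        # target policy (assumed for illustration)
b = {0: 0.5, 1: 0.5}         # behavior policy that actually generates the data
reward = {0: 1.0, 1: 10.0}   # deterministic reward per action

n = 10_000
actions = random.choices([0, 1], weights=[b[0], b[1]], k=n)

# Naive average: estimates the expected reward under b, not under pi.
naive_estimate = sum(reward[a] for a in actions) / n

# Importance-sampled average: each sample is weighted by pi(a) / b(a).
is_estimate = sum(pi[a] / b[a] * reward[a] for a in actions) / n

# Exact expected reward under the target policy, for comparison.
true_value = sum(pi[a] * reward[a] for a in pi)
```

In off-policy learning the same correction lets data gathered by an exploratory policy be used to evaluate the policy being improved.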

6

votes

1 answer

What's the difference between content-based attention and dot-product attention?

I'm following this blog post which enumerates the various types of attention. It mentions content-based attention where the alignment scoring function for the $j$th encoder hidden state with respect to the $i$th context vector is the cosine…

neural-networks attention seq2seq

asked Dec 30 '20 at 10:04

Alexander Soare

1,379
3
12
28
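
A brief sketch of the two scoring functions named in the question above, assuming the truncated excerpt refers to cosine similarity for content-based attention. The function names and vectors are hypothetical. The dot-product score grows with the magnitude of the vectors, whereas the cosine score depends only on their direction.

```python
import math

def dot_score(s, h):
    """Dot-product alignment score between a decoder state s and an encoder state h."""
    return sum(a * b for a, b in zip(s, h))

def cosine_score(s, h):
    """Cosine-similarity score: the dot product normalized by both vector norms."""
    norm_s = math.sqrt(sum(a * a for a in s))
    norm_h = math.sqrt(sum(b * b for b in h))
    return dot_score(s, h) / (norm_s * norm_h)

s = [1.0, 0.0]
h = [3.0, 4.0]
h_scaled = [6.0, 8.0]        # same direction as h, twice the length

dot_a, dot_b = dot_score(s, h), dot_score(s, h_scaled)
cos_a, cos_b = cosine_score(s, h), cosine_score(s, h_scaled)
```

Here the dot-product score doubles when `h` is scaled, while the cosine score stays the same.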

6

votes

2 answers

What is the Bellman equation actually telling us?

What does the Bellman equation actually say? And are there many flavours of that? I get a little confused when I look for the Bellman equation, because I feel like people are telling slightly different things about what it is. And I think the…

reinforcement-learning definitions value-functions bellman-equations

asked Dec 20 '20 at 21:49

Johnny

69
1
3

6

votes

1 answer

What techniques are used to make MDP discrete state space manageable?

Generating a discretized state space for an MDP (Markov Decision Process) model seems to suffer from the curse of dimensionality. Suppose my state has a few simple features: Feeling: Happy/Neutral/Sad Feeling: Hungry/Neither/Full Food left:…

reinforcement-learning markov-decision-process state-spaces continuous-action-spaces discrete-action-spaces

asked Dec 22 '16 at 01:35

Brendan Hill

263
1
6

6

votes

1 answer

During neural network training, can gradients leak sensitive information if the training data is homomorphically encrypted?

Some algorithms in the literature allow recovering the input data used to train a neural network. This is done using the gradients (updates) of weights, such as in Deep Leakage from Gradients (2019) by Ligeng Zhu et al. In case the neural network is…

neural-networks training gradient-descent ai-security training-datasets

asked Dec 19 '20 at 22:03

witdev

73
4

6

votes

1 answer

What kind of algorithm is the Levenberg–Marquardt algorithm?

Is the Levenberg–Marquardt algorithm a type of back-propagation algorithm, or does it belong to a different category of algorithm? Wikipedia says that it is a curve-fitting algorithm. How is a curve-fitting algorithm relevant to a neural net?

algorithm backpropagation

asked Dec 21 '16 at 09:53

user3642

6

votes

0 answers

$\frac{P(x_1 \mid y, s = 1) \dots P(x_n \mid y, s = 1) P(y \mid s = 1)}{P(x \mid s = 1)}$ indicates that naive Bayes learners are global learners?

I am currently studying the paper Learning and Evaluating Classifiers under Sample Selection Bias by Bianca Zadrozny. In section 3. Learning under sample selection bias, the author says the following: We can separate classifier learners into two…

machine-learning terminology papers naive-bayes selection-bias

asked Dec 13 '20 at 22:15

The Pointer

611
5
22
