Highest Voted 'notation' Questions - Artificial Intelligence Stack Exchange

11

votes

1 answer

What is the meaning of $V(D,G)$ in the GAN objective function?

Here is the GAN objective function. $$\min _{G} \max _{D} V(D, G)=\mathbb{E}_{\boldsymbol{x} \sim p_{\text {data }}(\boldsymbol{x})}[\log D(\boldsymbol{x})]+\mathbb{E}_{\boldsymbol{z} \sim p_{\boldsymbol{z}}(\boldsymbol{z})}[\log…

asked Apr 12 '19 at 20:53

i_rezic

245
1
6

8

votes

1 answer

How is the policy gradient calculated in REINFORCE?

Reading Sutton and Barto, I see the following in describing policy gradients: How is the gradient calculated with respect to an action (taken at time t)? I've read implementations of the algorithm, but conceptually I'm not sure I understand how the…

reinforcement-learning policy-gradients sutton-barto notation reinforce

asked Apr 21 '19 at 19:23

Hanzy

519
3
11

7

votes

3 answers

How are the reward functions $R(s)$, $R(s, a)$ and $R(s, a, s')$ equivalent?

In this video, the lecturer states that $R(s)$, $R(s, a)$ and $R(s, a, s')$ are equivalent representations of the reward function. Intuitively, this is the case, according to the same lecturer, because $s$ can be made to represent the state and the…

reinforcement-learning markov-decision-process proofs notation reward-functions

asked Feb 07 '19 at 15:38

nbro

42,615
12
119
217

5

votes

1 answer

What does the argmax of the expectation of the log likelihood mean?

What does the following equation mean? What does each part of the formula represent or mean? $$\theta^* = \underset {\theta}{\arg \max} \Bbb E_{x \sim p_{data}} \log {p_{model}(x|\theta) }$$

machine-learning math probability notation expectation

asked Jan 28 '18 at 11:15

arash moradi

181
1
6

5

votes

1 answer

What does the notation $\mathcal{N}(z; \mu, \sigma)$ stand for in statistics?

I know that the notation $\mathcal{N}(\mu, \sigma)$ stands for a normal distribution. But I'm reading the book "An Introduction to Variational Autoencoders" and in it, there is this notation: $$\mathcal{N}(z; 0, I)$$ What does it mean?…

terminology variational-autoencoder notation random-variable bayesian-statistics

asked Aug 23 '20 at 17:49

Peyman

624
1
6
14

5

votes

2 answers

Why are the value functions sometimes written with capital letters and other times with lower-case letters?

Why are the state-value and action-value functions sometimes written in lower-case letters and other times in capitals? For instance, why, in the Q-learning algorithm (page 131 of Sutton and Barto's book, but not only there), are capitals used $Q(S,…

reinforcement-learning value-functions notation

asked Jun 10 '20 at 02:46

d56

243
1
7

5

votes

2 answers

What is a probability distribution in machine learning?

If we are learning or working in the machine learning field, we frequently come across the term "probability distribution". I know what probability, conditional probability, and probability distribution/density mean in math, but what is its…

machine-learning terminology definitions probability-distribution notation

asked Nov 28 '19 at 03:58

Eka

1,106
8
24

5

votes

1 answer

What is the meaning of the square brackets in ant colony optimization?

I'm studying the paper "Minimizing Total Tardiness on a Single Machine Using Ant Colony Optimization", which proposes applying ant colony optimization to the SMTWTP. According to this paper: Each artificial ant iteratively and independently decides…

papers swarm-intelligence notation ant-colony-optimization

asked Nov 01 '19 at 12:59

Pablo

283
1
5

5

votes

1 answer

Understanding the equation of TD(0) in the paper "Learning to predict by the methods of temporal differences"

In the paper "Learning to predict by the methods of temporal differences" (p. 15), the weights in temporal-difference learning are updated according to the equation $$ \Delta w_t = \alpha \left(P_{t+1} - P_t\right) \sum_{k=1}^{t}{\lambda^{t-k}…

reinforcement-learning temporal-difference-methods notation

asked Jun 01 '19 at 14:41

Amanda

205
1
5

4

votes

1 answer

Notation used in paper on Continuous Time Reinforcement Learning

I am working on implementing the learning method described in this paper: https://homes.cs.washington.edu/~todorov/courses/amath579/reading/Continuous.pdf In the paper, the authors devise a continuous-time extension of the actor-critic method in…

reinforcement-learning machine-learning papers actor-critic-methods notation

asked Mar 03 '24 at 21:18

Derick Diana

41
2

4

votes

1 answer

Is the Bandit Problem an MDP?

I've read Sutton and Barto's introductory RL book. They define a policy as a mapping from states to probabilities of selecting each possible action. If the agent is following policy $\pi$ at time $t$, then $\pi(a|s)$ is the probability of taking…

reinforcement-learning comparison markov-decision-process notation multi-armed-bandits

asked Jun 15 '21 at 08:44

Snowball

225
1
7

4

votes

2 answers

Why do we use $X_{I_t,t}$ and $v_{I_t}$ to denote the reward received at time step $t$ and the distribution of the chosen arm $I_t$?

I'm doing some introductory research on classical (stochastic) MABs. However, I'm a little confused about the common notation (e.g. in the popular paper of Auer (2002) or Bubeck and Cesa-Bianchi (2012)). As in the latter study, let us consider an…

papers notation multi-armed-bandits upper-confidence-bound

asked Jul 16 '20 at 13:41

MAB_N00B

41
3

4

votes

1 answer

What does the term $|\mathcal{A}(s)|$ mean in the $\epsilon$-greedy policy?

I've been looking online for a while for a source that explains these computations, but I can't find anywhere what $|A(s)|$ means. I guess $A$ is the action set, but I'm not sure about that notation: $$\frac{\varepsilon}{|\mathcal{A}(s)|}…

reinforcement-learning monte-carlo-methods notation on-policy-methods epsilon-greedy-policy

asked Jul 14 '20 at 20:11

Metrician

195
5

4

votes

1 answer

What do the subscripts mean in $N_{t,n,\sigma,L}$?

A neural network can apparently be denoted as $N_{t,n,\sigma,L}$. What do these subscripts $t, n, \sigma$ and $L$ mean? Could you link me to a paper, article or webpage with an explanation for this?

neural-networks math definitions notation

asked Nov 13 '19 at 08:03

J. Doe

143
5

4

votes

1 answer

What does $x,y \sim \hat{p}_{data}$ mean in the Deep Learning book by Goodfellow?

In chapter 5 of the Deep Learning book by Ian Goodfellow, some of the notation in the loss function below really confuses me. I understand $x,y \sim p_{data}$ to mean a sample $(x, y)$ drawn from the original dataset distribution (or $y$ is the…

machine-learning deep-learning probability-distribution notation expectation

asked May 25 '19 at 12:02

David Ng

143
4
