Highest Voted Questions - Artificial Intelligence Stack Exchange

16

votes

3 answers

How to implement a variable action space in Proximal Policy Optimization?

I'm coding a Proximal Policy Optimization (PPO) agent with the Tensorforce library (which is built on top of TensorFlow). The first environment was very simple. Now, I'm diving into a more complex environment, where not all the actions are available…

reinforcement-learning proximal-policy-optimization discrete-action-spaces action-spaces

asked Aug 29 '18 at 16:04

Max

163
1
6

16

votes

5 answers

What is the most general definition of "intelligence"?

When we talk about artificial intelligence, human intelligence, or any other form of intelligence, what do we mean by the term intelligence in a general sense? What would you call intelligent and what not? In other words, how do we define the term…

definitions intelligence

asked Aug 18 '18 at 14:17

user79161

379
1
13

16

votes

3 answers

What is a "trajectory" in reinforcement learning?

I'm now learning about reinforcement learning, and I just came across the word "trajectory" in this answer. However, I'm not sure what it means. I read a few books on reinforcement learning, but none of them mentioned it. Usually these introductory…

reinforcement-learning terminology

asked Jul 31 '18 at 14:34

Blaszard

1,097
4
11
25

16

votes

1 answer

What is the fringe in the context of search algorithms?

terminology search definitions

asked Apr 08 '18 at 14:38

tahasozgen

309
1
2
7

16

votes

5 answers

Why are the initial weights of neural networks randomly initialised?

This might sound silly to someone who has plenty of experience with neural networks but it bothers me... Random initial weights might give you better results that would be somewhat closer to what a trained neural network should look like, but it…

neural-networks training weights weights-initialization

asked Oct 21 '17 at 06:52

Matas Vaitkevicius

271
5
12

16

votes

3 answers

What roles do knowledge bases play now, and what roles will they play in the future?

Nowadays, artificial intelligence seems almost equal to machine learning, especially deep learning. Some have said that deep learning will replace human experts, traditionally very important for feature engineering, in this field. It is said that…

natural-language-processing knowledge-representation expert-systems symbolic-computing knowledge-base

asked Mar 05 '17 at 07:07

Lerner Zhang

1,065
1
9
22

16

votes

1 answer

What language is the GPT-3 engine written in?

I know that the API is Python-based, but what is the GPT-3 engine mostly written in? C? C++? I'm having some trouble finding this info.

natural-language-processing programming-languages c++ gpt-3 c

asked May 12 '21 at 00:06

Otherness

285
1
2
6

16

votes

4 answers

Why do activation functions need to be differentiable in the context of neural networks?

Why should an activation function of a neural network be differentiable? Is it strictly necessary or is it just advantageous?

neural-networks math activation-functions

asked Dec 21 '16 at 23:26

user3642

16

votes

2 answers

Why is reinforcement learning not the answer to AGI?

I previously asked the question "How can an AI freely make decisions?". I got a great answer about how current algorithms lack agency. The first thing I thought of was reinforcement learning, since the entire concept is oriented around an agent…

reinforcement-learning philosophy agi chinese-room-argument artificial-curiosity

asked Dec 13 '19 at 18:53

joethemow

405
2
7

16

votes

1 answer

Will parameter sweeping on one split of data followed by cross validation discover the right hyperparameters?

Let's call our dataset splits train/test/evaluate. We're in a situation where we require months of data. So we prefer to use the evaluation dataset as infrequently as possible to avoid polluting our results. Instead, we do 10 fold cross validation…

machine-learning deep-learning hyperparameter-optimization cross-validation generalization

asked Sep 25 '19 at 05:33

Philipp Cannons

161
6

16

votes

2 answers

How can I automate the choice of the architecture of a neural network for an arbitrary problem?

Assume that I want to solve a problem with a neural network, but either I can't fit it to existing architectures (perceptron, Kohonen, etc.), or I'm simply not aware that they exist, or I'm unable to understand their mechanics and I rely on my…

neural-networks reference-request hyperparameter-optimization architecture neuroevolution

asked Aug 05 '16 at 21:29

Zoltán Schmidt

643
7
14

16

votes

1 answer

How to stay an up-to-date researcher in the ML/RL community?

As a student who wants to work on machine learning, I would like to know how to start my studies and how to keep up with the field afterwards. For example, I am willing to work on RL and MAB problems, but there is a huge literature on…

machine-learning reinforcement-learning research markov-decision-process

asked Jul 18 '19 at 11:54

Amin

481
2
12

16

votes

2 answers

Why is it called Latent Vector?

I just learned about GANs and I'm a little bit confused about the naming of the latent vector. First, in my understanding, a latent variable is a random variable that can't be measured directly (we need some calculation from other…

terminology generative-adversarial-networks

asked May 24 '19 at 07:04

malioboro

2,859
3
23
47

16

votes

3 answers

Is the optimal policy always stochastic if the environment is also stochastic?

Is the optimal policy always stochastic (that is, a map from states to a probability distribution over actions) if the environment is also stochastic? Intuitively, if the environment is deterministic (that is, if the agent is in a state $s$ and…

reinforcement-learning stochastic-policy deterministic-policy policies environment

asked Feb 15 '19 at 13:20

nbro

42,615
12
119
217

15

votes

1 answer

Why does the policy network in AlphaZero work?

In AlphaZero, the policy network (or head of the network) maps game states to a distribution of the likelihood of taking each action. This distribution covers all possible actions from that state. How is such a network possible? The possible actions…

neural-networks reinforcement-learning ai-design alphazero alphago-zero

asked Sep 14 '18 at 20:21

chessprogrammer

3,050
2
16
26
