Most Popular

1500 questions
8
votes
3 answers

Are neural networks statistical models?

From the abstract of the paper Neural Networks and Statistical Models, it would seem that ANNs are statistical models. In contrast, Machine Learning is not just glorified Statistics. I am looking for a more concise/summarized answer with focus on…
8
votes
1 answer

Suitable reward function for trading buy and sell orders

I am working to build a deep reinforcement learning agent which can place orders (i.e. limit buy and limit sell orders). The actions are {"Buy": 0, "Do Nothing": 1, "Sell": 2}. Suppose that all the features are well suited for this task. I wanted…
8
votes
2 answers

Why are lambda returns so rarely used in policy gradients?

I've seen the Monte Carlo return $G_{t}$ being used in REINFORCE and the TD($0$) target $r_t + \gamma Q(s', a')$ in vanilla actor-critic. However, I've never seen someone use the lambda return $G^{\lambda}_{t}$ in these situations, nor in any other…
7
votes
1 answer

How to recognise metaphors in texts using NLP/NLU?

What are the current NLP/NLU techniques that can extract metaphors from texts? For example: "His words cut deeper than a knife." Or a simpler form like: "Life is a journey that must be travelled no matter how bad the roads and accommodations."
7
votes
2 answers

How can the importance sampling ratio be different than zero when the target policy is deterministic?

In the book Reinforcement Learning: An Introduction (2nd edition) Sutton and Barto define at page 104 (p. 126 of the pdf), equation (5.3), the importance sampling ratio, $\rho _{t:T-1}$, as follows: $$\rho…
7
votes
1 answer

Why do layered neural nets struggle with continuous data?

In this article, the writer claims that a new type of neural net is required to deal with data that is both continuous and sparsely sampled. It was my understanding that this was the entire purpose of techniques that use neural nets, to…
Dylan
7
votes
1 answer

How does Hearthstone AI deal with random events?

I want to learn more about the AI in CCGs such as Hearthstone. I now know that one of the main algorithms used in this kind of game is MCTS. It analyses the most promising moves and expands the search tree based on random sampling of the…
zen
7
votes
4 answers

Can the mean squared error be negative?

I'm new to machine learning. I was watching one of Prof. Andrew Ng's videos about gradient descent from the machine learning online course. It said that we want our cost function (in this case, the mean squared error) to have the minimum value, but that…
7
votes
1 answer

Deep Q-Learning poor convergence on Stochastic Environment

I'm trying to implement a Deep Q-network in Keras/TF that learns to play Minesweeper (our stochastic environment). I have noticed that the agent learns to play the game pretty well with both small and large board sizes. However, it only…
7
votes
1 answer

What is an objective function?

Local search algorithms are useful for solving pure optimization problems, in which the aim is to find the best state according to an objective function. My question is what is the objective function?
7
votes
1 answer

Why doesn't VAE suffer mode collapse?

Mode collapse is a common problem faced by GANs. I am curious why VAEs don't suffer from mode collapse.
7
votes
1 answer

How is iterative deepening A* better than A*?

The iterative deepening A* search is an algorithm that can find the shortest path between a designated start node and any member of a set of goals. The A* algorithm evaluates nodes by combining the cost to reach the node and the cost to get from…
Huma Qaseem
7
votes
1 answer

Which of Rosenblatt's papers describes Rosenblatt's perceptron training algorithm?

I struggle to find Rosenblatt's perceptron training algorithm in any of his publications from 1957 to 1961, namely: Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms; The perceptron: A probabilistic model for information…
7
votes
1 answer

What does the agent in reinforcement learning exactly do?

What is an agent in reinforcement learning (RL)? I think it is not the neural network behind it. What exactly does the agent in RL do?
TVSuchty
7
votes
1 answer

Should I be decaying the learning rate and the exploration rate in the same manner?

Should I be decaying the learning rate and the exploration rate in the same manner? What counts as too slow or too fast a decay for the exploration rate and the learning rate? Or does it vary from model to model?