Most Popular

1500 questions
6
votes
1 answer

Is a Sobel filter for edge detection a type of Cellular Neural Network?

I have implemented a Sobel filter for edge detection in Matlab without using its toolbox. I am a bit confused: is a Sobel filter a type of Cellular Neural Network? Both the Sobel filter and a Cellular Neural Network compute each cell's output from the cells in its neighborhood.
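
To make the "output from neighborhood cells" idea concrete, here is a minimal Python sketch of a Sobel gradient-magnitude filter. It is not the asker's Matlab code: the function name `sobel_magnitude`, the plain nested-list image format, and the choice to leave border pixels at zero are all assumptions made for illustration.

```python
# Standard 3x3 Sobel kernels for horizontal (GX) and vertical (GY) changes.
GX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]
GY = [[-1, -2, -1],
      [0, 0, 0],
      [1, 2, 1]]


def sobel_magnitude(image):
    """Gradient magnitude of a 2-D list of numbers (hypothetical helper).

    Border pixels are left at 0.0 because their 3x3 neighborhood is incomplete.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0.0
            # Each output depends only on the 3x3 neighborhood of (r, c).
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    pixel = image[r + dr][c + dc]
                    gx += GX[dr + 1][dc + 1] * pixel
                    gy += GY[dr + 1][dc + 1] * pixel
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out


# Illustrative 4x4 image with a vertical step edge between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]
edges = sobel_magnitude(image)
```

The sketch uses fixed, hand-chosen weights and a single feed-forward pass over the image; whether that makes it a special case of a Cellular Neural Network is exactly what the question asks.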
6
votes
1 answer

What is the “Hello World” problem of Unsupervised Learning?

As a followup to this question, I'm interested in what the typical "Hello World" problem (first easy example problem) is for unsupervised learning. A quick Google search didn't find any obvious answers for me.
6
votes
2 answers

How does A* search work given there are multiple goal states?

While reading through the fundamentals of AI, I came across a situation (i.e., a search space) that is illustrated in the following picture. These are the heuristic estimates: $h(B)=9$, $h(D)=10$, $h(A)=2$, $h(C)=1$. If we use the A* algorithm, the node $B$ will…
hellojoshhhy
  • 163
  • 1
  • 3
6
votes
2 answers

Why don't we use auto-encoders instead of GANs?

I have watched Stanford's lectures about artificial intelligence, and I currently have one question: why don't we use autoencoders instead of GANs? Basically, a GAN receives a random vector and generates a new sample from it. So, if we…
6
votes
1 answer

Why are neural networks preferred to other classification functions optimized by gradient descent?

Consider a neural network, e.g. as presented by Nielsen here. Abstractly, we just construct some function $f: \mathbb{R}^n \to [0,1]^m$ for some $n,m \in \mathbb{N}$ (i.e. the dimensions of the input and output space) that depends on a large set of…
6
votes
3 answers

Is there a logical method of deducing an optimal batch size when training a Deep Q-learning agent with experience replay?

I am training an RL agent using deep Q-learning with experience replay. At each frame, I am currently sampling 32 random transitions from a queue that stores a maximum of 20000 and training as described in the Atari with Deep RL paper. All is…
user192361237
  • 225
  • 2
  • 6
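
The replay setup described in this question (a queue capped at 20000 transitions, 32 sampled uniformly per frame) can be sketched in Python as follows. Only the two numbers come from the question; the names `store` and `sample_batch` and the tuple layout of a transition are hypothetical, and the sketch does not suggest what the optimal batch size is.

```python
import random
from collections import deque

BUFFER_CAPACITY = 20000  # maximum stored transitions, as stated in the question
BATCH_SIZE = 32          # transitions sampled per frame, as stated in the question

# A deque with maxlen silently discards the oldest item once it is full.
replay_buffer = deque(maxlen=BUFFER_CAPACITY)


def store(transition):
    """Append one transition, e.g. (state, action, reward, next_state, done)."""
    replay_buffer.append(transition)


def sample_batch(batch_size=BATCH_SIZE):
    """Sample transitions uniformly at random, without replacement."""
    k = min(batch_size, len(replay_buffer))
    return random.sample(replay_buffer, k)


# Illustrative usage with dummy transitions (assumed format, not real data).
for step in range(25000):
    store((step, 0, 0.0, step + 1, False))
batch = sample_batch()
```

With this layout, the batch size is a single constant, so trying other values only requires changing `BATCH_SIZE` or passing a different `batch_size` argument.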
6
votes
2 answers

Given two neural networks that compute two functions $f(x)$ and $g(x)$, how can I create a neural network that computes $f(x)g(x)$?

I have two functions $f(x)$ and $g(x)$, and each of them can be computed with a neural network $\phi_f$ and $\phi_g$. My question is, how can I write a neural net for $f(x)g(x)$? So, for example, if $g(x)$ is constant and equal to $c$ and $\phi_f =…
6
votes
1 answer

What does the number of required expert demonstrations in Imitation Learning depend on?

I just read the following points about the number of required expert demonstrations in imitation learning, and I'd like some clarifications. For the purpose of context, I'll be using a linear reward function throughout this post (i.e. the reward can…
6
votes
1 answer

What are the pros and cons of sparse and dense rewards in reinforcement learning?

From what I understand, if the rewards are sparse the agent will have to explore more to get rewards and learn the optimal policy, whereas if the rewards are dense in time, the agent is quickly guided towards its learning goal. Are the above…
6
votes
2 answers

My Deep Q-Learning Network does not learn for OpenAI gym's cartpole problem

I am implementing OpenAI gym's cartpole problem using Deep Q-Learning (DQN). I followed tutorials (video and otherwise) and learned all about it. I wrote my own implementation and thought it should work, but the agent is not learning. I will…
SJa
  • 393
  • 3
  • 17
6
votes
1 answer

Why does TD Learning require Markovian domains?

One of my friends and I were discussing the differences between Dynamic Programming, Monte-Carlo, and Temporal Difference (TD) Learning as policy evaluation methods - and we agreed on the fact that Dynamic Programming requires the Markov assumption…
6
votes
1 answer

Which paper introduced the term "softmax"?

Nowadays, the softmax function is widely used in deep learning and, specifically, classification with neural networks. However, the origins of this term and function are almost never mentioned anywhere. So, which paper introduced this term?
nbro
  • 42,615
  • 12
  • 119
  • 217
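
For reference, the function this question is about is usually defined as $\text{softmax}(z)_i = e^{z_i} / \sum_j e^{z_j}$. A minimal Python sketch of that definition follows; the function name `softmax` and the max-subtraction step (a common numerical-stability trick that leaves the result unchanged) are choices made here and are not taken from the question.

```python
import math


def softmax(logits):
    """Map a list of real numbers to a probability distribution."""
    # Subtracting the maximum avoids overflow in exp() without changing
    # the result, because the common factor cancels in the ratio.
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative input: larger logits receive larger probabilities.
probs = softmax([1.0, 2.0, 3.0])
```

The outputs are non-negative, sum to 1 up to floating-point error, and preserve the ordering of the inputs.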
6
votes
2 answers

Why does shifting all the rewards have a different impact on the performance of the agent?

I am new to reinforcement learning. For my application, I have found out that if my reward function contains some negative and positive values, my model does not give the optimal solution, but the solution is not bad as it still gives positive…
6
votes
3 answers

What is Reinforcement Learning?

What is the cleanest, easiest way to explain the concept of Reinforcement Learning to a non-STEM work colleague? What are the main ideas behind Reinforcement Learning?
Pluviophile
  • 1,293
  • 7
  • 20
  • 40
6
votes
1 answer

How do I recognise a bandit problem?

I'm having difficulty understanding the distinction between a bandit problem and a non-bandit problem. An example of the bandit problem is an agent playing $n$ slot machines with the goal of discovering which slot machine is the most probable to…