Most Popular

1500 questions
14 votes, 4 answers

What is the relevance of AIXI on current artificial intelligence research?

From Wikipedia: AIXI ['ai̯k͡siː] is a theoretical mathematical formalism for artificial general intelligence. It combines Solomonoff induction with sequential decision theory. AIXI was first proposed by Marcus Hutter in 2000[1] and the results…
rcpinto
14 votes, 3 answers

MCTS for non-deterministic games with very high branching factor for chance nodes

I'm trying to use a Monte Carlo Tree Search for a non-deterministic game. Apparently, one of the standard approaches is to model non-determinism using chance nodes. The problem for this game is that it has a very high min-entropy for the random…
Mark
14 votes, 3 answers

What sort of mathematical problems are there in AI that people are working on?

I recently got an 18-month postdoc position in a math department. It's a position with relatively light teaching duties and a lot of freedom in the type of research I want to do. Previously I was mostly doing research in probability and…
LeafGlowPath
14 votes, 3 answers

Why does it make sense to normalize rewards per episode in reinforcement learning?

In OpenAI's actor-critic and in OpenAI's REINFORCE, the rewards are normalized like so: rewards = (rewards - rewards.mean()) / (rewards.std() + eps), on every episode individually. This is probably the baseline reduction, but I'm not entirely…
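The per-episode normalization quoted in this excerpt can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from either implementation: the helper name normalize_rewards, the use of NumPy, and the eps value of 1e-8 are all hypothetical choices. Note also that NumPy's std() defaults to the population standard deviation, which may differ from what the original code computes.

```python
import numpy as np

def normalize_rewards(rewards, eps=1e-8):
    """Standardize one episode's rewards to zero mean and roughly unit variance.

    Hypothetical helper: the name and the eps default are assumptions,
    not taken from the implementations mentioned in the question.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    # eps keeps the division finite when every reward in the episode is equal.
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Each episode is normalized on its own, using only its own statistics.
episode = [1.0, 2.0, 3.0, 4.0]
normalized = normalize_rewards(episode)

# An episode with identical rewards maps to all zeros instead of dividing by zero.
constant = normalize_rewards([5.0, 5.0, 5.0])
```

Because the mean and standard deviation come from a single episode, the same raw reward can map to different normalized values in different episodes.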
13 votes, 4 answers

Is the singularity something to be taken seriously?

The term Singularity is often used in mainstream media for describing visionary technology. It was introduced by Ray Kurzweil in a popular book The Singularity Is Near: When Humans Transcend Biology (2005). In his book, Kurzweil gives an outlook to…
13 votes, 2 answers

Which layer in a CNN consumes more training time: convolution layers or fully connected layers?

In a convolutional neural network, which layer consumes more training time: convolution layers or fully connected layers? We can take the AlexNet architecture as an example. I want to see the time breakdown of the training process. I want a relative…
13 votes, 1 answer

Can a non-differentiable layer be used in a neural network, if it's not learned?

For example, AFAIK, the pooling layer in a CNN is not differentiable, but it can be used because it's not learning. Is it always true?
13 votes, 2 answers

How important is consciousness for making advanced artificial intelligence?

How important are consciousness and self-consciousness for making advanced AIs? How far away are we from making such? When making, e.g., a neural network, there's (very probably) no consciousness within it, just the mathematics behind it, but do we need…
13 votes, 1 answer

How exactly can ReLUs approximate non-linear and curved functions?

Currently, the most commonly used activation functions are ReLUs. So I answered the question "What is the purpose of an activation function in neural networks?" and, while writing the answer, it struck me: how exactly can ReLUs approximate a…
user9947
13 votes, 5 answers

What is the fundamental difference between CNN and RNN?

What is the fundamental difference between convolutional neural networks and recurrent neural networks? Where are they applied?
13 votes, 2 answers

Can you train a neural network by simply giving it ratings each time it runs?

I am currently trying to train a bot for a game I am creating. It is a 2d game with a complex map made of various shapes. The bot and character shoot bullets that are capable of ricocheting. The neural network outputs a vector in which the bot will…
Beluker
13 votes, 3 answers

Is it possible to train a neural network to estimate a vehicle's length?

I have a large dataset (over 100k samples) of vehicles with the ground truth of their lengths. Is it possible to train a deep network to measure/estimate vehicle length? I haven't seen any papers related to estimating object size using a deep neural…
13 votes, 4 answers

Why do LLMs and RNNs learn so fast during inference but, ironically, so slowly during training?

Why do LLMs learn so fast during inference but, ironically, so slowly during training? That is, if you teach an AI a new concept in a prompt, it will learn and use the concept perfectly and flawlessly, throughout the whole prompt, after just one shot.…
13 votes, 5 answers

Why is a bias parameter needed in neural networks?

I have read several resources, including previously asked questions such as this. I have also read arguments related to intercepts needed to separate linearly separable data. If my neural network can perform feature transformation, what is the need…
13 votes, 5 answers

Is there a rigorous proof that AGI is possible, at least, in theory?

It is often implicitly assumed in computer science that the human mind, or at least some mechanical calculations that humans perform (see the Church-Turing thesis), can be replicated with a Turing machine, therefore Artificial General Intelligence…