Most Popular

1,500 questions
19 votes · 4 answers

What is the difference between self-supervised and unsupervised learning?

What is the difference between self-supervised and unsupervised learning? The terms logically overlap (and maybe self-supervised learning is a subset of unsupervised learning?), but I cannot pinpoint exactly what that difference is. What are the…
19 votes · 4 answers

1 hidden layer with 1000 neurons vs. 10 hidden layers with 100 neurons

These types of questions may be problem-dependent, but I have tried to find research that addresses the question of whether the number of hidden layers and their size (the number of neurons in each layer) really matter or not. So my question is, does it…
Stephen Johnson • 1,039
19 votes · 4 answers

Issues with and alternatives to Deep Learning approaches?

Over the last 50 years, the rise/fall/rise in popularity of neural nets has acted as something of a 'barometer' for AI research. It's clear from the questions on this site that people are interested in applying Deep Learning (DL) to a wide variety…
NietzscheanAI • 7,286
19 votes · 1 answer

Are these two versions of back-propagation equivalent?

Just for fun, I am trying to develop a neural network. Now, for backpropagation, I have seen two techniques. The first one is used here and in many other places too. What it does is: it computes the error for each output neuron, then backpropagates it into…
19 votes · 2 answers

How do I decide the optimal number of layers for a neural network?

How do I decide the optimal number of layers for a neural network (feedforward or recurrent)?
19 votes · 4 answers

What exactly is a hidden state in an LSTM and RNN?

I'm working on a project, where we use an encoder-decoder architecture. We decided to use an LSTM for both the encoder and decoder due to its hidden states. In my specific case, the hidden state of the encoder is passed to the decoder, and this…
19 votes · 1 answer

Could a Boltzmann machine store more patterns than a Hopfield net?

This is from the closed beta for AI, with the question originally posted by user number 47; all credit to them. According to Wikipedia, Boltzmann machines can be seen as the stochastic, generative counterpart of Hopfield nets. Both are recurrent…
Mithical • 2,965
19 votes · 3 answers

What is geometric deep learning?

What is geometric deep learning (GDL)? Here are a few sub-questions How is it different from deep learning? Why do we need GDL? What are some applications of GDL?
18 votes · 4 answers

Why does the discount rate in the REINFORCE algorithm appear twice?

I was reading the book Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto (complete draft, November 5, 2017). On page 271, the pseudo-code for the episodic Monte-Carlo Policy-Gradient Method is presented. Looking at…
18 votes · 3 answers

Why are embeddings added, not concatenated?

Let's consider the following example from BERT. I cannot understand why "the input embeddings are the sum of the token embeddings, the segmentation embeddings, and the position embeddings". The thing is, these embeddings carry different types of…
18 votes · 4 answers

What makes neural networks so good at predictions?

I am new to neural networks, and I am trying to understand mathematically what makes neural networks so good at classification problems. By taking the example of a small neural network (for example, one with 2 inputs, 2 nodes in a hidden layer and 2…
Aditya Gupta • 181
18 votes · 10 answers

How to classify data which is spiral in shape?

I have been messing around in TensorFlow Playground. One of the input data sets is a spiral. No matter what input parameters I choose, and no matter how wide and deep a neural network I make, I cannot fit the spiral. How do data scientists fit data of…
18 votes · 2 answers

What research has been done in the domain of "identifying sarcasm in text"?

Identifying sarcasm is considered one of the most difficult open problems in the domain of ML and NLP/NLU. So, has any considerable research been done on that front? If yes, then what is the accuracy like? Please also explain the NLP model…
18 votes · 2 answers

How does novelty search work?

In this article, the author claims that guiding evolution by novelty alone (without explicit goals) can solve problems even better than using explicit goals. In other words, using a novelty measure as a fitness function for a genetic algorithm works…
rcpinto • 2,148
18 votes · 4 answers

Did Minsky and Papert know that multi-layer perceptrons could solve XOR?

In their famous book Perceptrons: An Introduction to Computational Geometry, Minsky and Papert show that a single-layer perceptron can't solve the XOR problem. This contributed to the first AI winter, resulting in funding cuts for neural networks.…
rcpinto • 2,148