Most Popular

1500 questions
7
votes
2 answers

Why is dropout favoured compared to reducing the number of units in hidden layers?

Why is dropout favored over reducing the number of units in hidden layers for convolutional networks? If a large set of units leads to overfitting and dropping out "averages" the response units, why not just suppress units? I have read…
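To make the contrast in the question concrete, here is a minimal sketch of inverted dropout in plain Python. The function name `dropout` and the example activations are illustrative assumptions, not taken from the question. The point it shows is that a different random subset of units is silenced on each call, whereas removing units from the layer would silence the same ones permanently.

```python
import random

def dropout(activations, p_drop, rng):
    """Inverted dropout: zero each unit with probability p_drop, rescale the rest."""
    keep = 1.0 - p_drop
    # Survivors are divided by `keep` so the expected activation is unchanged.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)          # fixed seed only to make the demo repeatable
hidden = [0.5, 1.0, 1.5, 2.0]   # hypothetical hidden-layer activations
for step in range(3):
    # Each training step samples a fresh mask over the same four units.
    print(step, dropout(hidden, 0.5, rng))
```

At inference time the mask is normally switched off, so all units contribute.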
7
votes
3 answers

Does each filter in each convolution layer create a new image?

Say I have a CNN with this structure:
input = 1 image (say, 30x30 RGB pixels)
first convolution layer = 10 5x5 convolution filters
second convolution layer = 5 3x3 convolution filters
one dense layer with 1 output
So a graph of the network will…
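The shapes implied by this structure can be worked out with a few lines of Python. This is a sketch that assumes stride 1 and no padding, neither of which the question states; the helper `conv_output_shape` is a hypothetical name. Under those assumptions, each filter spans all input channels and produces one feature map, so the channel count after a layer equals its number of filters.

```python
def conv_output_shape(height, width, kernel, n_filters, stride=1, padding=0):
    """Return (height, width, channels) after a square convolution."""
    out_h = (height + 2 * padding - kernel) // stride + 1
    out_w = (width + 2 * padding - kernel) // stride + 1
    return out_h, out_w, n_filters

# Input from the question: one 30x30 RGB image (3 channels).
h, w, c = 30, 30, 3

layer1 = conv_output_shape(h, w, kernel=5, n_filters=10)
layer2 = conv_output_shape(layer1[0], layer1[1], kernel=3, n_filters=5)

print(layer1)  # (26, 26, 10)
print(layer2)  # (24, 24, 5)

# Weights per filter (ignoring bias): kernel * kernel * input channels.
print(5 * 5 * c)          # 75 for each first-layer filter
print(3 * 3 * layer1[2])  # 90 for each second-layer filter
```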
7
votes
1 answer

How much training data is required for a GAN?

I'm beginning to study and implement GANs to generate more datasets. I'll just try to experiment with state-of-the-art GAN models as described here https://paperswithcode.com/sota/image-generation-on-cifar-10. The problem is I don't have a big…
7
votes
3 answers

Is there an open-source implementation for graph convolution networks for weighted graphs?

Currently, I'm using a Python library, StellarGraph, to implement a GCN. I now have a situation where I have graphs with weighted edges. Unfortunately, StellarGraph doesn't support those graphs. I'm looking for an open-source implementation for…
7
votes
1 answer

What is predicate argument recognition?

There is a study titled "The Necessity of Parsing for Predicate Argument Recognition"; however, I couldn't find much information that explains what 'Predicate Argument Recognition' is. What is it exactly and how does it work, briefly?
7
votes
1 answer

How could an AI detect whether an enemy in a game can be blocked off/trapped?

Imagine a game played on a 10x10 grid where a player can move up, down, left, or right, and imagine there are two players on this grid: an enemy and you. In this game, there are walls on the grid which you can't go through. The objective of this…
Ahmed
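One common way to reason about this setup is a flood fill from the enemy's position. The sketch below is only that: it assumes "trapped" can be approximated by the enemy's reachable region being small, which the truncated question does not confirm. The function `reachable_cells` and the wall layout are hypothetical.

```python
from collections import deque

def reachable_cells(walls, start, size=10):
    """Breadth-first flood fill over a size x size grid with 4-way movement."""
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in walls and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical layout: walls sealing off the 2x2 top-left corner.
walls = {(0, 2), (1, 2), (2, 0), (2, 1), (2, 2)}
enemy = (0, 0)

region = reachable_cells(walls, enemy)
print(len(region))  # 4 cells: the enemy cannot leave the corner
```

To test whether a single move by the player would block the enemy, the same function could be rerun with the candidate cell added to `walls` and the region sizes compared.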
7
votes
1 answer

What would the Valkyrie AI robot do on Mars?

I was reading that the Valkyrie robot was originally designed to 'carry out search and rescue missions'. However, there were some talks about sending it to Mars to assist astronauts. What kind of specific training or tasks are planned for 'him' to be…
kenorb
7
votes
4 answers

Is k-fold cross-validation more effective than splitting the dataset into training and test datasets to prevent overfitting?

I want to prevent my model from overfitting. I think that k-fold cross-validation (because it is doing this each time with different datasets) may be more effective than splitting the dataset into training and test datasets to prevent overfitting,…
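For reference, the two schemes being compared differ only in how indices are partitioned. The following is a minimal sketch of k-fold splitting in plain Python; `k_fold_indices` is a hypothetical helper, and no shuffling is done, which a real pipeline would usually add.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k (train, validation) index pairs."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    folds = []
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, val))
        start += size
    return folds

for train, val in k_fold_indices(10, 3):
    print("train:", train, "validation:", val)
```

Every sample lands in exactly one validation fold, so each one is used for both training and evaluation across the k runs, unlike a single train/test split.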
7
votes
3 answers

Is the Mask Needed for Masked Self-Attention During Inference with GPT-2?

My understanding is that masked self-attention is necessary during training of GPT-2, as otherwise it would be able to directly see the correct next output at each iteration. My question is whether the attention mask is necessary, or even possible,…
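As background for the question, this is a small sketch of what a causal mask does to attention weights, written in plain Python rather than against any GPT-2 implementation. The function name and the all-zero score matrix are illustrative assumptions.

```python
import math

def causal_attention_weights(scores):
    """Row-wise softmax where position i may only attend to positions j <= i."""
    n = len(scores)
    weights = []
    for i in range(n):
        # Future positions get -inf, so they receive exactly zero weight.
        masked = [scores[i][j] if j <= i else float("-inf") for j in range(n)]
        m = max(masked)
        exps = [math.exp(s - m) for s in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Hypothetical raw scores for a 3-token sequence (all equal for readability).
scores = [[0.0] * 3 for _ in range(3)]
for row in causal_attention_weights(scores):
    print(row)
```

With equal scores, the first token attends only to itself, the second splits its weight over two positions, and the third over all three.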
7
votes
0 answers

How is the rollout from MCTS implemented in both the AlphaGo Zero and AlphaZero algorithms?

In the vanilla Monte Carlo tree search (MCTS) implementation, the rollout is usually implemented following a uniform random policy, that is, it takes random actions until the game is finished and only then the information gathered is backed up. I…
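The vanilla rollout described here can be sketched on a toy game. The game below (take 1 or 2 stones, whoever takes the last stone wins) is a hypothetical stand-in chosen for brevity; it is not from the question and says nothing about how AlphaGo Zero or AlphaZero evaluate leaves.

```python
import random

def random_rollout(stones, player, rng):
    """Play uniformly random moves until the game ends; return the winner (0 or 1)."""
    while True:
        # Legal moves: take 1 or 2 stones, never more than remain.
        move = rng.choice([m for m in (1, 2) if m <= stones])
        stones -= move
        if stones == 0:
            return player      # this player took the last stone
        player = 1 - player    # otherwise the turn passes

rng = random.Random(0)
n_rollouts = 1000
wins = sum(random_rollout(7, 0, rng) == 0 for _ in range(n_rollouts))

# The fraction of wins is the value estimate that would be backed up the tree.
print(wins / n_rollouts)
```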
7
votes
1 answer

What are the differences between Yolo v1 and CenterNet?

I recently read a new paper (late 2019) about a one-shot object detector called CenterNet. Apart from this, I'm using the Yolo (V3) one-shot detector, and what surprised me is the close similarity between Yolo V1 and CenterNet. First, both frameworks…
7
votes
1 answer

What technologies are needed for a self-driving car?

Google, Tesla, and Apple have all built or are building their own self-driving cars. As an expert in a related area, I am interested in knowing, at a high level, the systems and techniques that go into self-driving cars. How easy is it for me to make…
Harsh
7
votes
2 answers

Why do very deep non-ResNet architectures perform worse than shallower ones at the same iteration? Shouldn't they just train slower?

My understanding of the vanishing gradient problem in deep networks is that as backprop progresses through the layers, the gradients become small, and thus training progresses more slowly. I'm having a hard time reconciling this understanding with images…
7
votes
1 answer

How do neural network topologies affect GPU/TPU acceleration?

I was thinking about different neural network topologies for some applications. However, I am not sure how this would affect the efficiency of hardware acceleration using GPU/TPU/some other chip. If, instead of layers that would be fully connected,…
7
votes
1 answer

Why is evolutionary training of neural networks not popular?

Evolutionary algorithms are mentioned in some sources as a method to train a neural network (finding weights, not hyperparameters). However, I have not heard about one practical application of such an idea yet. My question is, why is that? What are…
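For readers unfamiliar with the idea, here is a minimal sketch of evolving weights directly, using a simple mutate-and-keep-if-better loop on a one-weight linear model. The function `evolve_weights`, the data, and all hyperparameters are illustrative assumptions; practical evolutionary methods use populations and more elaborate update rules.

```python
import random

def evolve_weights(loss, n_weights, generations=200, sigma=0.1, seed=0):
    """(1+1)-style evolution: mutate the weights, keep the child if it is no worse."""
    rng = random.Random(seed)
    best = [rng.uniform(-1, 1) for _ in range(n_weights)]
    best_loss = loss(best)
    for _ in range(generations):
        # Gaussian mutation of every weight; no gradients are computed.
        child = [w + rng.gauss(0, sigma) for w in best]
        child_loss = loss(child)
        if child_loss <= best_loss:
            best, best_loss = child, child_loss
    return best, best_loss

# Hypothetical data generated by y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

def squared_error(weights):
    w = weights[0]
    return sum((w * x - y) ** 2 for x, y in data)

weights, final_loss = evolve_weights(squared_error, n_weights=1)
print(weights, final_loss)
```

Each generation costs one full loss evaluation per candidate, which hints at why the approach scales poorly to networks with millions of weights.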