Most Popular
1500 questions
6
votes
1 answer
Which neural networks are suitable for visual place recognition?
I am doing a project on visual place recognition in changing environments. The CNN used here is mostly AlexNet, and a feature vector is constructed from layer 3.
Does anyone know of similar work using other CNNs, for example, VGGnet (which I am…
Daniel Wong
6
votes
2 answers
Should the actor or actor-target model be used to make predictions after training is complete (DDPG)?
The situation
I am referring to the paper by T. P. Lillicrap et al., "Continuous control with deep reinforcement learning", where they discuss deep learning in the context of continuous action spaces ("Deep Deterministic Policy Gradient").
Based on the…
a_guest
6
votes
1 answer
Find anomalies from records of categorical data
I have a data-set with $m$ observations and $p$ categorical variables (nominal), each variable $X_1, X_2,\dots, X_p$ has several different possible values.
Ultimately, I am looking for a way to find anomalies i.e. to identify rows for which the…
bat
6
votes
2 answers
What is the current state of AGI development?
Could you please provide some insight into the current stage of development in the AGI area? Are there any projects that have had breakthroughs recently? Maybe some news source to follow on this topic?
Alex
6
votes
2 answers
How can I formulate the map colouring problem as a hill climbing search problem?
I have a map. I need to colour it with $k$ colours, such that two adjacent regions do not share a colour.
How can I formulate the map colouring problem as a hill climbing search problem?
jrk
6
votes
1 answer
How to implement exploration function and learning rate in Q Learning
I'm trying to implement Q-learning (state-based representation and no neural / deep stuff) but I'm having a hard time getting it to learn anything.
I believe my issue is with the exploration function and/or learning rate. Thing is, I see different…
SaldaVonSchwartz
6
votes
1 answer
How are the kernels initialized in a convolutional neural network?
I am currently learning about CNNs. I am confused about how filters (aka kernels) are initialized.
Suppose that we have a $3 \times 3$ kernel. How are the values of this filter initialized before training? Do you just use predefined image kernels?…
Inkplay_
6
votes
1 answer
What is a weighted average in a non-stationary k-armed bandit problem?
In the book Reinforcement Learning: An Introduction (page 25), by Richard S. Sutton and Andrew G. Barto, there is a discussion of the k-armed bandit problem, where the expected reward from the bandits changes slightly over time (that is, the problem…
chessprogrammer
6
votes
2 answers
Neural network for data visualization
At my work, we're currently doing some research into data visualisation for highly interconnected data, basically graphs.
We've been implementing all sorts of different layouts and trying to see which fits best, but, due to the nature of the problem…
tiansivive
6
votes
1 answer
Is one big network faster than several small ones?
The basis of my question is that a CNN that does great on MNIST is far smaller than a CNN that does great on ImageNet. Clearly, as the number of potential target classes increases, along with image complexity (background, illumination, etc.), the…
pshlady
6
votes
1 answer
Is there any theoretical work on representation in machine learning?
In AI, how knowledge is represented is a crucial topic. In traditional knowledge representation and reasoning, substantial work has focused on logic-based and graph-based knowledge representation methods. In contrast, in machine learning, knowledge…
nova
6
votes
4 answers
Unclear points regarding ROC curve in machine learning
Consider the ROC curve below:
It is not clear to me why the "PERFECT CLASSIFIER" line is really perfect, because there are points on this line (see the green circles) for which the false positive rate is non-zero.
DSPinfinity
6
votes
3 answers
Why doesn't SiLU suffer from a worse version of a "dying ReLU" problem?
Unlike ReLU, the derivative of SiLU is non-zero everywhere except at the global minimum. However, intuitively it seems like having a negative gradient when the input is very negative should be even worse than having a zero gradient, since the…
llllvvuu
6
votes
1 answer
Why is the denominator ignored in the Bayes' rule?
The naïve Bayes' generative algorithm is often represented by the following formula:
$$\text{argmax}_{y} p(y|x) = \text{argmax}_y \frac{p(x|y)p(y)}{p(x)} \approx \text{argmax}_y p(x|y)p(y)$$
Why do we have $p(x)=1$ which allows the approximation…
gcorso
6
votes
5 answers
How does an activation function's derivative measure error rate in a neural network?
A blog post called "Text Classification using Neural Networks" states that the derivative of the output of a sigmoid function is used to measure error rates.
What is the rationale for this?
I thought the derivative of a sigmoid function output is…
tim_xyz