For questions about statistics in the context of artificial intelligence and, in particular, machine learning. Note that there is a Stack Exchange site dedicated entirely to statistics: Cross Validated.
Questions tagged [statistics]
39 questions
4 votes, 0 answers
When do two identical neural networks have uncorrelated errors?
In Chapter 9, section 9.1.6, Raul Rojas describes how committees of networks can reduce the prediction error by training N identical neural networks and averaging the results.
If $f_i$ are the functions approximated by the $N$ neural nets,…
EmmanuelMess
4 votes, 1 answer
What is the difference between model and data distributions?
Is there any difference between the model distribution and data distribution, or are they the same?
Bhuwan Bhatt
3 votes, 1 answer
Why has statistics-based AI become more popular than other forms of AI?
What is the fundamental reason that statistics-based AI (e.g., ML and neural networks) has become more popular than other forms of AI, such as fuzzy logic and rule-based AI (e.g., Prolog)?
user366312
3 votes, 1 answer
If I can repeat ML experiments, how can I bound my results?
It has been asked here if we should repeat lengthy experiments.
Suppose I can repeat them: how should I present the results? For instance, if I am measuring the accuracy of a model on test data over several training epochs, and I repeat this several times…
biofa.esp
3 votes, 1 answer
What is the most statistically acceptable method for tuning neural network hyperparameters on very small datasets?
Neural networks are usually evaluated by dividing a dataset into three splits:
training,
validation, and
test
The idea is that critical hyperparameters of the network such as the number of epochs and the learning rate can be tuned by testing the…
Mike NZ
3 votes, 0 answers
What is meant by subspace clustering in MFA?
The basic idea of MFA is to perform subspace clustering by assuming a covariance structure for each component of the form $\Sigma_i = \Lambda_i \Lambda_i^T + \Psi_i$, where $\Lambda_i \in \mathbb{R}^{D\times d}$ is the factor loadings matrix…
stoic-santiago
3 votes, 1 answer
Research paths/areas for improving the performance of CNNs when faced with limited data
I've been reading through the research literature for image processing, computer vision, and convolutional neural networks. For image classification and object recognition, I know that convolutional neural networks deliver state-of-the-art…
The Pointer
3 votes, 1 answer
How does $\mathbb{E}$ suddenly change to $\mathbb{E}_{\pi'}$ in this equation?
In Sutton and Barto's book, on page 63 (page 81 of the PDF):
$$\mathbb{E}[R_{t+1} + \gamma v_\pi(S_{t+1}) \mid S_t=s,A_t=\pi'(s)] = \mathbb{E}_{\pi'}[R_{t+1} + \gamma v_\pi(S_{t+1}) \mid S_{t} = s]$$
How does $\mathbb{E}$ suddenly change to…
ZERO NULLS
2 votes, 2 answers
What is a "multinomial model" in machine learning?
What is the mathematical definition of "multinomial model" in machine learning?
I would appreciate a good definition plus an example.
DSPinfinity
2 votes, 2 answers
How does a VGG-based Style-Loss incorporate color information?
I've recently been reading a lot about style transfer, its applications, and its implications. I understand what the Gram matrix is and does, and I can program it. But one thing has been puzzling me: how does the VGG style loss incorporate color…
masterBroesel
2 votes, 1 answer
What does it mean when a model "statistically outperforms" another?
I was reading this paper, in which the authors state the following:
We also use the T-Test to test the significance of GMAN in 1 hour ahead prediction compared to Graph WaveNet. The p-value is less than 0.01, which demonstrates that GMAN statistically…
razvanc92
2 votes, 0 answers
What is the relationship between PAC learning and classic parameter estimation theorems?
What are the differences and similarities between PAC learning and classic parameter estimation theorems (e.g., consistency results for parameter estimators such as the MLE)?
FourierFlux
1 vote, 1 answer
Does anyone use Statistical Energy to monitor generative AI training?
Statistical Energy (Szekely & Rizzo, 2013, or Aslan & Zech, 2005) can be used as a statistical test of whether two distributions are the same or different. It works particularly well on high-dimensional datasets where other methods like the…
tkw954
1 vote, 0 answers
Resulting quantiles from Quantile Regression DQN
In my QR-DQN application, the resulting quantiles for a state $s$ and action $a$ take the form of the blue line in the figure. The method estimates expected values well and trains effectively. However, I know that in my problem the return distribution…
amavrits
1 vote, 3 answers
What is the difference between q and p in statistical notation (as used in VAEs)?
I'm looking at general visuals of Variational Autoencoders and I'm seeing that the encoder is typically expressed as q(z|x) with phi as a subscript while the decoder is p(x|z) with theta as a subscript. I've seen the use of p and q in other papers…
Kiran Manicka