
arXiv:2205.15549 claims that the machine learning community has misunderstood VC (Vapnik–Chervonenkis) theory, and that a VC-theoretic analysis is sufficient to explain the double descent phenomenon.

The situation seems fairly odd: ICLR rejected the paper, yet it has since been published in the journal IEEE Transactions on Neural Networks and Learning Systems. A version of the paper in another form was then published in the journal Neural Networks.

From what I can see, the ICLR reviewers and the people in charge judged the paper to be seriously flawed, while the journal reviewers seem to have believed that the authors had a point. I am unsure what the current consensus on this paper is.

If this paper is correct, then it provides a simple and elegant understanding of double descent, so it would be great if anyone can point out what is really going on here.

1 Answer


The authors have a valid point. They claim that what many see as a “mysterious” second descent in test error as model complexity increases can be derived from a refined reading of the central VC-theoretic generalization bound within structural risk minimization (SRM), without invoking new paradigms beyond classical statistics and learning theory. The condition is that one measures the VC dimension of a neural network model class correctly: in terms of the squared norm of the weights rather than, as is conventional, their count or the network architecture.
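For reference, here is a sketch of the kind of bound involved. It is written in standard textbook notation, not necessarily the paper's own. The symbols $n$ (sample size), $h$ (VC dimension), $\eta$ (confidence parameter), $R$ (radius of a ball containing the data), $A$ (bound on the weight norm) and $d$ (input dimension) are my labels. The first line is Vapnik's classical classification bound, holding with probability at least $1-\eta$. The second is his classical norm-based estimate of the VC dimension for margin-based (canonical) hyperplanes with $\|w\| \le A$:

```latex
% Classical VC generalization bound (classification), with probability >= 1 - \eta
R(w) \;\le\; R_{\mathrm{emp}}(w)
  + \sqrt{\frac{h\left(\ln\frac{2n}{h} + 1\right) - \ln\frac{\eta}{4}}{n}}

% Norm-based VC-dimension estimate for canonical hyperplanes with \|w\| \le A
h \;\le\; \min\!\left(R^{2}A^{2},\; d\right) + 1
```

As I read the argument, the point is that under the second inequality $h$ is controlled by the weight norm and not by the parameter count, so adding parameters does not by itself force the confidence term in the first inequality to grow. Whether the paper uses exactly these forms is an assumption on my part.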

This perspective is attractive because it offers a simple and elegant explanation of double descent, but it remains controversial. The community is divided because the validity of the claim ultimately hinges on whether the worst-case nature of VC theory is consistent with the data- and algorithm-dependent complexity measure that the authors treat as the VC dimension of a neural network model class.

cinch