Tag: relu
-
Stable Minima of ReLU Neural Networks Suffer from the Curse of Dimensionality: The Neural Shattering Phenomenon
Stable Minima of ReLU Neural Networks Suffer from the Curse of Dimensionality: The Neural Shattering Phenomenon arXiv:2506.20779v1 Announce Type: new Abstract: We study the implicit bias of flatness / low (loss) curvature and its effects on generalization in two-layer overparameterized ReLU networks with multivariate inputs — a problem well motivated by the minima stability and…
-
Global Minimizers of $ell^p$-Regularized Objectives Yield the Sparsest ReLU Neural Networks
Global Minimizers of $ell^p$-Regularized Objectives Yield the Sparsest ReLU Neural Networks arXiv:2505.21791v1 Announce Type: new Abstract: Overparameterized neural networks can interpolate a given dataset in many different ways, prompting the fundamental question: which among these solutions should we prefer, and what explicit regularization strategies will provably yield these solutions? This paper addresses the challenge of…
-
Convergence of Adam in Deep ReLU Networks via Directional Complexity and Kakeya Bounds
Convergence of Adam in Deep ReLU Networks via Directional Complexity and Kakeya Bounds arXiv:2505.15013v1 Announce Type: new Abstract: First-order adaptive optimization methods like Adam are the default choices for training modern deep neural networks. Despite their empirical success, the theoretical understanding of these methods in non-smooth settings, particularly in Deep ReLU networks, remains limited. ReLU…
-
Confidence Interval Construction and Conditional Variance Estimation with Dense ReLU Networks
Confidence Interval Construction and Conditional Variance Estimation with Dense ReLU Networks arXiv:2412.20355v1 Announce Type: new Abstract: This paper addresses the problems of conditional variance estimation and confidence interval construction in nonparametric regression using dense networks with the Rectified Linear Unit (ReLU) activation function. We present a residual-based framework for conditional variance estimation, deriving nonasymptotic bounds…