Near-optimal estimates for the $\ell^p$-Lipschitz constants of deep random ReLU neural networks

arXiv:2506.19695v1 Announce Type: new
Abstract: This paper studies the $\ell^p$-Lipschitz constants of ReLU neural networks $\Phi: \mathbb{R}^d \to \mathbb{R}$ with random parameters for $p \in [1,\infty]$. The distribution of the weights follows a variant of the He initialization and the biases are drawn from symmetric distributions. We derive high probability upper and lower bounds for wide networks that differ at most by a factor that is logarithmic in the network's width and linear in its depth. In the special case of shallow networks, we obtain matching bounds. Remarkably, the behavior of the $\ell^p$-Lipschitz constant varies significantly between the regimes $p \in [1,2)$ and $p \in [2,\infty]$. For $p \in [2,\infty]$, the $\ell^p$-Lipschitz constant behaves similarly to $\Vert g \Vert_{p'}$, where $g \in \mathbb{R}^d$ is a $d$-dimensional standard Gaussian vector and $1/p + 1/p' = 1$. In contrast, for $p \in [1,2)$, the $\ell^p$-Lipschitz constant aligns more closely to $\Vert g \Vert_{2}$.
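
The quantity studied in the abstract can be probed numerically. The sketch below is an illustration only, not the authors' method: it relies on the standard fact that the $\ell^p$-Lipschitz constant of a piecewise linear map $\Phi: \mathbb{R}^d \to \mathbb{R}$ equals the supremum of $\Vert \nabla \Phi(x) \Vert_{p'}$ over points of differentiability, so the largest dual norm of the gradient over sampled points gives a lower bound. Everything else is an assumption made for illustration: the plain He initialization with variance 2/fan_in (the abstract only says "a variant"), standard Gaussian biases as one example of a symmetric distribution, and the hypothetical function names.

```python
import numpy as np


def sample_relu_network(d, width, depth, rng):
    """Sample (W, b) pairs for a ReLU network R^d -> R with `depth` hidden layers.

    Assumption: weights ~ N(0, 2 / fan_in) (plain He initialization) and
    biases ~ N(0, 1); the paper's exact variant may differ.
    """
    sizes = [d] + [width] * depth + [1]
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))
        b = rng.normal(0.0, 1.0, size=fan_out)
        params.append((W, b))
    return params


def gradient_at(params, x):
    """Gradient of the network at x, via the chain rule through active ReLU units."""
    jac = np.eye(x.shape[0])  # Jacobian of the current hidden state w.r.t. the input
    h = x
    for W, b in params[:-1]:
        pre = W @ h + b
        mask = (pre > 0).astype(float)  # derivative of ReLU at the pre-activations
        jac = (mask[:, None] * W) @ jac
        h = np.maximum(pre, 0.0)
    W_out, _ = params[-1]  # the output layer is affine, with no ReLU
    return (W_out @ jac).ravel()


def dual_exponent(p):
    """Return p' with 1/p + 1/p' = 1, for p in [1, inf]."""
    if p == 1:
        return np.inf
    if p == np.inf:
        return 1.0
    return p / (p - 1.0)


def lipschitz_lower_bound(params, p, points):
    """Largest dual norm of the gradient over `points`: a lower bound only."""
    q = dual_exponent(p)
    return max(np.linalg.norm(gradient_at(params, x), ord=q) for x in points)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 50
    params = sample_relu_network(d, width=256, depth=3, rng=rng)
    points = rng.normal(size=(200, d))
    g = rng.normal(size=d)  # standard Gaussian vector used for comparison
    for p in (1, 1.5, 2, 4, np.inf):
        q = dual_exponent(p)
        print(p, lipschitz_lower_bound(params, p, points), np.linalg.norm(g, ord=q))
```

Because random sampling only explores finitely many linear regions, the printed values are lower bounds and say nothing about the upper bounds or the precise constants derived in the paper.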

Sjoerd Dirksen, Patrick Finke, Paul Geuchen, Dominik Stöger, Felix Voigtlaender
