{"id":5869,"date":"2025-08-06T07:02:51","date_gmt":"2025-08-06T07:02:51","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/08\/06\/2508-03688\/"},"modified":"2025-08-06T07:02:51","modified_gmt":"2025-08-06T07:02:51","slug":"2508-03688","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/08\/06\/2508-03688\/","title":{"rendered":"Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws"},"content":{"rendered":"<p>    Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n    <!-- no image --><br \/>\n \t<BR><br \/>\n<BR><\/BR><\/p>\n<div>arXiv:2508.03688v1 Announce Type: new<br \/>\nAbstract: We study the optimization and sample complexity of gradient-based training of a two-layer neural network with quadratic activation function in the high-dimensional regime, where the data is generated as $y propto sum_{j=1}^{r}lambda_j sigmaleft(langle boldsymbol{theta_j}, boldsymbol{x}rangleright), boldsymbol{x} sim N(0,boldsymbol{I}_d)$, $sigma$ is the 2nd Hermite polynomial, and $lbraceboldsymbol{theta}_j rbrace_{j=1}^{r} subset mathbb{R}^d$ are orthonormal signal directions. We consider the extensive-width regime $r asymp d^beta$ for $beta in [0, 1)$, and assume a power-law decay on the (non-negative) second-layer coefficients $lambda_jasymp j^{-alpha}$ for $alpha geq 0$. We present a sharp analysis of the SGD dynamics in the feature learning regime, for both the population limit and the finite-sample (online) discretization, and derive scaling laws for the prediction risk that highlight the power-law dependencies on the optimization time, sample size, and model width. Our analysis combines a precise characterization of the associated matrix Riccati differential equation with novel matrix monotonicity arguments to establish convergence guarantees for the infinite-dimensional effective dynamics.<\/div>\n<p> \t<BR><br \/>\n <BR><\/BR><br \/>\n    G&#8217;erard Ben Arous, Murat A. Erdogdu, N. Mert Vural, Denny Wu<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n<a href=\"https:\/\/arxiv.org\/abs\/2508.03688\">Go to original source<\/a><br \/>\n \t<BR><br \/>\n <BR><\/BR><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws arXiv:2508.03688v1 Announce Type: new Abstract: We study the optimization and sample complexity of gradient-based training of a two-layer neural network with quadratic activation function in the high-dimensional regime, where the data is generated as $y propto sum_{j=1}^{r}lambda_j sigmaleft(langle boldsymbol{theta_j}, boldsymbol{x}rangleright), boldsymbol{x} sim N(0,boldsymbol{I}_d)$, [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,113,112],"tags":[1883,793,199],"class_list":["post-5869","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-cs-lg","category-stat-ml","tag-boldsymbol","tag-dynamics","tag-learning"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/5869"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=5869"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/5869\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=5869"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=5869"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=5869"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}