{"id":3182,"date":"2025-04-18T07:02:25","date_gmt":"2025-04-18T07:02:25","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/04\/18\/2504-13110\/"},"modified":"2025-04-18T07:02:25","modified_gmt":"2025-04-18T07:02:25","slug":"2504-13110","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/04\/18\/2504-13110\/","title":{"rendered":"Propagation of Chaos in One-hidden-layer Neural Networks beyond Logarithmic Time"},"content":{"rendered":"<p>    Propagation of Chaos in One-hidden-layer Neural Networks beyond Logarithmic Time<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n    <!-- no image --><br \/>\n \t<BR><br \/>\n<BR><\/BR><\/p>\n<div>arXiv:2504.13110v1 Announce Type: new<br \/>\nAbstract: We study the approximation gap between the dynamics of a polynomial-width neural network and its infinite-width counterpart, both trained using projected gradient descent in the mean-field scaling regime. We demonstrate how to tightly bound this approximation gap through a differential equation governed by the mean-field dynamics. A key factor influencing the growth of this ODE is the local Hessian of each particle, defined as the derivative of the particle&#8217;s velocity in the mean-field dynamics with respect to its position. We apply our results to the canonical feature learning problem of estimating a well-specified single-index model; we permit the information exponent to be arbitrarily large, leading to convergence times that grow polynomially in the ambient dimension $d$. We show that, due to a certain &#8220;self-concordance&#8221; property in these problems &#8212; where the local Hessian of a particle is bounded by a constant times the particle&#8217;s velocity &#8212; polynomially many neurons are sufficient to closely approximate the mean-field dynamics throughout training.<\/div>\n<p> \t<BR><br \/>\n <BR><\/BR><br \/>\n    Margalit Glasgow, Denny Wu, Joan Bruna<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n<a href=\"https:\/\/arxiv.org\/abs\/2504.13110\">Go to original source<\/a><br \/>\n \t<BR><br \/>\n <BR><\/BR><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Propagation of Chaos in One-hidden-layer Neural Networks beyond Logarithmic Time arXiv:2504.13110v1 Announce Type: new Abstract: We study the approximation gap between the dynamics of a polynomial-width neural network and its infinite-width counterpart, both trained using projected gradient descent in the mean-field scaling regime. We demonstrate how to tightly bound this approximation gap through a differential [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,113,112],"tags":[793,1049,1048],"class_list":["post-3182","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-cs-lg","category-stat-ml","tag-dynamics","tag-field","tag-mean"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/3182"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=3182"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/3182\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=3182"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=3182"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=3182"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}