{"id":4310,"date":"2025-06-03T07:02:30","date_gmt":"2025-06-03T07:02:30","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/06\/03\/2506-00182\/"},"modified":"2025-06-03T07:02:30","modified_gmt":"2025-06-03T07:02:30","slug":"2506-00182","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/06\/03\/2506-00182\/","title":{"rendered":"Overfitting has a limitation: a model-independent generalization error bound based on R&#233;nyi entropy"},"content":{"rendered":"\n<div>Overfitting has a limitation: a model-independent generalization error bound based on R&#233;nyi entropy<\/div>\n<p> \t<BR><br \/>\n<BR><\/BR><br \/>\n    <!-- no image --><br \/>\n \t<BR><br \/>\n<BR><\/BR><\/p>\n<div>arXiv:2506.00182v1 Announce Type: new<br \/>\nAbstract: Will further scaling up of machine learning models continue to bring success? A significant challenge in answering this question lies in understanding generalization error, which is the impact of overfitting. Understanding generalization error behavior of increasingly large-scale machine learning models remains a significant area of investigation, as conventional analyses often link error bounds to model complexity, failing to fully explain the success of extremely large architectures. This research introduces a novel perspective by establishing a model-independent upper bound for generalization error applicable to algorithms whose outputs are determined solely by the data&#8217;s histogram, such as empirical risk minimization or gradient-based methods. Crucially, this bound is shown to depend only on the R&#233;nyi entropy of the data-generating distribution, suggesting that a small generalization error can be maintained even with arbitrarily large models, provided the data quantity is sufficient relative to this entropy. 
This framework offers a direct explanation for the phenomenon where generalization performance degrades significantly when random noise is injected into the data: the degradation is attributed to the consequent increase in the data distribution&#8217;s R&#233;nyi entropy. Furthermore, we adapt the no-free-lunch theorem to be data-distribution-dependent, demonstrating that an amount of data corresponding to the R&#233;nyi entropy is indeed essential for successful learning, thereby highlighting the tightness of our proposed generalization bound.<\/div>\n<p> \t<BR><br \/>\n <BR><\/BR><br \/>\n    Atsushi Suzuki<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n<a href=\"https:\/\/arxiv.org\/abs\/2506.00182\">Go to original source<\/a><br \/>\n \t<BR><br \/>\n <BR><\/BR><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Overfitting has a limitation: a model-independent generalization error bound based on R&#233;nyi entropy arXiv:2506.00182v1 Announce Type: new Abstract: Will further scaling up of machine learning models continue to bring success? A significant challenge in answering this question lies in understanding generalization error, which is the impact of overfitting. 
Understanding generalization error behavior of increasingly large-scale [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,414,113,415,190,112,191],"tags":[84,2168,119],"class_list":["post-4310","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-cs-it","category-cs-lg","category-math-it","category-math-st","category-stat-ml","category-stat-th","tag-data","tag-error","tag-generalization"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/4310"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=4310"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/4310\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=4310"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=4310"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=4310"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}