Tag: generalization

Deep Neural Networks as Iterated Function Systems and a Generalization Bound

arXiv:2601.19958v1 (announce type: new). Abstract: Deep neural networks (DNNs) achieve remarkable performance on a wide range of tasks, yet their mathematical analysis remains fragmented: stability and generalization are typically studied in disparate frameworks and on a case-by-case basis. Architecturally, DNNs rely on the recursive…

January 29, 2026
Towards A Unified PAC-Bayesian Framework for Norm-based Generalization Bounds

arXiv:2601.08100v1 (announce type: new). Abstract: Understanding the generalization behavior of deep neural networks remains a fundamental challenge in modern statistical learning theory. Among existing approaches, PAC-Bayesian norm-based bounds have demonstrated particular promise due to their data-dependent nature and their ability to capture algorithmic and geometric properties…

January 14, 2026
Impact of Positional Encoding: Clean and Adversarial Rademacher Complexity for Transformers under In-Context Regression

arXiv:2512.09275v1 (announce type: new). Abstract: Positional encoding (PE) is a core architectural component of Transformers, yet its impact on the Transformer’s generalization and robustness remains unclear. In this work, we provide the first generalization analysis for a single-layer Transformer under in-context…

December 11, 2025
Revisiting Theory of Contrastive Learning for Domain Generalization

arXiv:2512.02831v1 (announce type: new). Abstract: Contrastive learning is among the most popular and powerful approaches for self-supervised representation learning, where the goal is to map semantically similar samples close together while separating dissimilar ones in the latent space. Existing theoretical methods assume that downstream task classes are…

December 3, 2025
Generalization Below the Edge of Stability: The Role of Data Geometry

arXiv:2510.18120v1 (announce type: new). Abstract: Understanding generalization in overparameterized neural networks hinges on the interplay between the data geometry, neural architecture, and training dynamics. In this paper, we theoretically explore how data geometry controls this implicit bias. This paper presents theoretical results for overparameterized…

October 22, 2025
Overfitting has a limitation: a model-independent generalization error bound based on Rényi entropy

arXiv:2506.00182v1 (announce type: new). Abstract: Will further scaling up of machine learning models continue to bring success? A significant challenge in answering this question lies in understanding generalization error, which is the impact of overfitting. Understanding generalization error behavior of increasingly large-scale…

June 3, 2025
Continuous Domain Generalization

arXiv:2505.13519v1 (announce type: new). Abstract: Real-world data distributions often shift continuously across multiple latent factors such as time, geography, and socioeconomic context. However, existing domain generalization approaches typically treat domains as discrete or evolving along a single axis (e.g., time), which fails to capture the complex, multi-dimensional nature of real-world variation. This…

May 21, 2025
Generalization Analysis for Contrastive Representation Learning under Non-IID Settings

arXiv:2505.04937v1 (announce type: new). Abstract: Contrastive Representation Learning (CRL) has achieved impressive success in various domains in recent years. Nevertheless, the theoretical understanding of the generalization behavior of CRL is limited. Moreover, to the best of our knowledge, the current literature only analyzes generalization bounds under…

May 9, 2025
Generalization in Federated Learning: A Conditional Mutual Information Framework

arXiv:2503.04091v1 (announce type: new). Abstract: Federated Learning (FL) is a widely adopted privacy-preserving distributed learning framework, yet its generalization performance remains less explored compared to centralized learning. In FL, the generalization error consists of two components: the out-of-sample gap, which measures the gap between the empirical…

March 7, 2025
Generalization Bounds for Equivariant Networks on Markov Data

arXiv:2503.00292v1 (announce type: new). Abstract: Equivariant neural networks play a pivotal role in analyzing datasets with symmetry properties, particularly in complex data structures. However, integrating equivariance with Markov properties presents notable challenges due to the inherent dependencies within such data. Previous research has primarily concentrated on establishing…

March 4, 2025
Diagonal Over-parameterization in Reproducing Kernel Hilbert Spaces as an Adaptive Feature Model: Generalization and Adaptivity

arXiv:2501.08679v1 (announce type: cross). Abstract: This paper introduces a diagonal adaptive kernel model that dynamically learns kernel eigenvalues and output coefficients simultaneously during training. Unlike fixed-kernel methods tied to the neural tangent kernel theory, the diagonal adaptive kernel model adapts…

January 16, 2025
Training a neural network for data reduction and better generalization

arXiv:2411.17180v1 (announce type: new). Abstract: The motivation for sparse learners is to compress the inputs (features) by selecting only the ones needed for good generalization. Linear models with LASSO-type regularization achieve this by setting the weights of irrelevant features to zero, effectively identifying and ignoring…

November 27, 2024