Tag: permutation

  • One Permutation Is All You Need: Fast, Reliable Variable Importance and Model Stress-Testing

    arXiv:2512.13892v1. Abstract: Reliable estimation of feature contributions in machine learning models is essential for trust, transparency and regulatory compliance, especially when models are proprietary or otherwise operate as black boxes. While permutation-based methods are a standard tool for this…

  • Transformers, Time Series, and the Myth of Permutation Invariance

    There’s a common misconception in ML/DL that Transformers shouldn’t be used for forecasting because attention is permutation-invariant. Recent evidence suggests the opposite: in Google’s latest model, for example, experiments show that the model performs just as well with or without positional embeddings. You can find an…

  • Predictable Compression Failures: Why Language Models Actually Hallucinate

    arXiv:2509.11208v1. Abstract: Large language models perform near-Bayesian inference yet violate permutation invariance on exchangeable data. We resolve this by showing transformers minimize expected conditional description length (cross-entropy) over orderings, $\mathbb{E}_\pi[\ell(Y \mid \Gamma_\pi(X))]$, which admits a Kolmogorov-complexity interpretation up to additive constants, rather than the…

  • Analyzing the Role of Permutation Invariance in Linear Mode Connectivity

    arXiv:2503.06001v1. Abstract: It was empirically observed in Entezari et al. (2021) that, when accounting for the permutation invariance of neural networks, there is likely no loss barrier along the linear interpolation between two SGD solutions, a phenomenon known as linear mode…