Tag: noise

  • Generalized Robust Adaptive-Bandwidth Multi-View Manifold Learning in High Dimensions with Noise

    arXiv:2602.10530v1. Announce Type: new. Abstract: Multiview datasets are common in scientific and engineering applications, yet existing fusion methods offer limited theoretical guarantees, particularly in the presence of heterogeneous and high-dimensional noise. We propose Generalized Robust Adaptive-Bandwidth Multiview Diffusion Maps (GRAB-MDM), a new kernel-based diffusion…

  • When Features Beat Noise: A Feature Selection Technique Through Noise-Based Hypothesis Testing

    arXiv:2511.20851v1. Announce Type: new. Abstract: Feature selection has remained a daunting challenge in machine learning and artificial intelligence, where increasingly complex, high-dimensional datasets demand principled strategies for isolating the most informative predictors. Despite widespread adoption, many established techniques suffer from notable limitations; some…

  • PCA recovery thresholds in low-rank matrix inference with sparse noise

    arXiv:2511.11927v1. Announce Type: new. Abstract: We study the high-dimensional inference of a rank-one signal corrupted by sparse noise. The noise is modelled as the adjacency matrix of a weighted undirected graph with finite average connectivity in the large size limit. Using the replica method from…

  • Effects of label noise on the classification of outlier observations

    arXiv:2511.08808v1. Announce Type: new. Abstract: This study investigates the impact of adding noise to the training set classes in classification tasks using the BCOPS algorithm (Balanced and Conformal Optimized Prediction Sets), proposed by Guan & Tibshirani (2022). The BCOPS algorithm is an application of conformal…

  • Scale-Adaptive Generative Flows for Multiscale Scientific Data

    arXiv:2509.02971v1. Announce Type: new. Abstract: Flow-based generative models can face significant challenges when modeling scientific data with multiscale Fourier spectra, often producing large errors in fine-scale features. We address this problem within the framework of stochastic interpolants, via principled design of noise distributions and interpolation schedules. The key…

  • Noise Robust One-Class Intrusion Detection on Dynamic Graphs

    arXiv:2508.14192v1. Announce Type: cross. Abstract: In the domain of network intrusion detection, robustness against contaminated and noisy data inputs remains a critical challenge. This study introduces a probabilistic version of the Temporal Graph Network Support Vector Data Description (TGN-SVDD) model, designed to enhance detection accuracy in the…

  • Generating random noise for media data

    Hey everyone – I work on an ML team in the industry, and I’m currently building a predictive model to catch signals in live media data to sense when potential viral moments or crises are happening for brands. We have live media trackers at my company that capture all…

  • Physics constrained learning of stochastic characteristics

    arXiv:2507.12661v1. Announce Type: new. Abstract: Accurate state estimation requires careful consideration of uncertainty surrounding the process and measurement models; these characteristics are usually not well-known and need an experienced designer to select the covariance matrices. An error in the selection of covariance matrices could impact the accuracy of the…

  • Optimal High-probability Convergence of Nonlinear SGD under Heavy-tailed Noise via Symmetrization

    arXiv:2507.09093v1. Announce Type: new. Abstract: We study convergence in high-probability of SGD-type methods in non-convex optimization and the presence of heavy-tailed noise. To combat the heavy-tailed noise, a general black-box nonlinear framework is considered, subsuming nonlinearities like sign, clipping, normalization and their smooth counterparts.…

  • It’s Hard to Be Normal: The Impact of Noise on Structure-agnostic Estimation

    arXiv:2507.02275v1. Announce Type: new. Abstract: Structure-agnostic causal inference studies how well one can estimate a treatment effect given black-box machine learning estimates of nuisance functions (like the impact of confounders on treatment and outcomes). Here, we find that the answer depends in a…

  • Batch Bayesian Optimization for High-Dimensional Experimental Design: Simulation and Visualization

    arXiv:2504.03943v1. Announce Type: new. Abstract: Bayesian Optimization (BO) is increasingly used to guide experimental optimization tasks. To elucidate BO behavior in noisy and high-dimensional settings typical for materials science applications, we perform batch BO of two six-dimensional test functions: an Ackley function representing a needle-in-a-haystack…

  • The Art of Noise

    In my last several articles I talked about generative deep learning algorithms, which are mostly related to text generation tasks. So I think it would be interesting to switch to generative algorithms for image generation now. We know that nowadays there are plenty of deep learning models specialized for…

  • The Method of Moments Estimator for Gaussian Mixture Models

    Audio processing is one of the most important application domains of digital signal processing (DSP) and machine learning. Modeling acoustic environments is an essential step in developing digital audio processing systems such as speech recognition, speech enhancement, acoustic echo cancellation, etc. Acoustic environments are filled with background…

  • Generalized Recorrupted-to-Recorrupted: Self-Supervised Learning Beyond Gaussian Noise

    arXiv:2412.04648v1. Announce Type: cross. Abstract: Recorrupted-to-Recorrupted (R2R) has emerged as a methodology for training deep networks for image restoration in a self-supervised manner from noisy measurement data alone, demonstrating equivalence in expectation to the supervised squared loss in the case of Gaussian noise. However, its effectiveness with non-Gaussian…

  • Modeling High-Dimensional Dependent Data in the Presence of Many Explanatory Variables and Weak Signals

    arXiv:2412.04736v1. Announce Type: cross. Abstract: This article considers a novel and widely applicable approach to modeling high-dimensional dependent data when a large number of explanatory variables are available and the signal-to-noise ratio is low. We postulate that a $p$-dimensional response series…

  • Generalized Diffusion Model with Adjusted Offset Noise

    arXiv:2412.03134v1. Announce Type: new. Abstract: Diffusion models have become fundamental tools for modeling data distributions in machine learning and have applications in image generation, drug discovery, and audio synthesis. Despite their success, these models face challenges when generating data with extreme brightness values, as evidenced by limitations in…