Tag: probability

  • An efficient, accurate, and interpretable machine learning method for computing probability of failure

    arXiv:2601.21089v1. Abstract: We introduce a novel machine learning method called the Penalized Profile Support Vector Machine based on the Gabriel edited set for the computation of the probability of failure for a complex system as determined by a threshold…

  • Generative modeling of conditional probability distributions on the level-sets of collective variables

    arXiv:2512.17374v1. Abstract: Given a probability distribution $\mu$ in $\mathbb{R}^d$ represented by data, we study in this paper the generative modeling of its conditional probability distributions on the level-sets of a collective variable $\xi: \mathbb{R}^d \rightarrow \mathbb{R}^k$, where $1 \le k…

  • Efficient Level-Crossing Probability Calculation for Gaussian Process Modeled Data

    arXiv:2512.12442v1. Abstract: Almost all scientific data have uncertainties originating from different sources. Gaussian process regression (GPR) models are a natural way to model data with Gaussian-distributed uncertainties. GPR also has the benefit of reducing I/O bandwidth and storage requirements for large scientific simulations.…

  • High-Probability Bounds For Heterogeneous Local Differential Privacy

    arXiv:2510.11895v1. Abstract: We study statistical estimation under local differential privacy (LDP) when users may hold heterogeneous privacy levels and accuracy must be guaranteed with high probability. Departing from the common in-expectation analyses, and for one-dimensional and multi-dimensional mean estimation problems, we develop finite sample upper…

  • Quantum-inspired probability metrics define a complete, universal space for statistical learning

    arXiv:2508.21086v1. Abstract: Comparing probability distributions is a core challenge across the natural, social, and computational sciences. Existing methods, such as Maximum Mean Discrepancy (MMD), struggle in high-dimensional and non-compact domains. Here we introduce quantum probability metrics (QPMs), derived by embedding probability…

  • Prediction-Powered Inference with Inverse Probability Weighting

    arXiv:2508.10149v1. Abstract: Prediction-powered inference (PPI) is a recent framework for valid statistical inference with partially labeled data, combining model-based predictions on a large unlabeled set with bias correction from a smaller labeled subset. We show that PPI can be extended to handle informative labeling by replacing…

  • Nearly Minimax Discrete Distribution Estimation in Kullback-Leibler Divergence with High Probability

    arXiv:2507.17316v1. Abstract: We consider the problem of estimating a discrete distribution $p$ with support of size $K$ and provide both upper and lower bounds with high probability in KL divergence. We prove that in the worst case, for any estimator $\widehat{p}$,…

  • When Diffusion Models Memorize: Inductive Biases in Probability Flow of Minimum-Norm Shallow Neural Nets

    arXiv:2506.19031v1. Abstract: While diffusion models generate high-quality images via probability flow, the theoretical understanding of this process remains incomplete. A key question is when probability flow converges to training samples or more general points on the data manifold.…

  • On the Wasserstein Geodesic Principal Component Analysis of probability measures

    arXiv:2506.04480v1. Abstract: This paper focuses on Geodesic Principal Component Analysis (GPCA) on a collection of probability distributions using the Otto-Wasserstein geometry. The goal is to identify geodesic curves in the space of probability measures that best capture the modes of variation of…

  • Kernel Quantile Embeddings and Associated Probability Metrics

    arXiv:2505.20433v1. Abstract: Embedding probability distributions into reproducing kernel Hilbert spaces (RKHS) has enabled powerful nonparametric methods such as the maximum mean discrepancy (MMD), a statistical distance with strong theoretical and computational properties. At its core, the MMD relies on kernel mean embeddings to represent distributions…

  • ReLU integral probability metric and its applications

    arXiv:2504.18897v1. Abstract: We propose a parametric integral probability metric (IPM) to measure the discrepancy between two probability measures. The proposed IPM leverages a specific parametric family of discriminators, such as single-node neural networks with ReLU activation, to effectively distinguish between distributions, making it applicable in…

  • Near-optimal algorithms for private estimation and sequential testing of collision probability

    arXiv:2504.13804v1. Abstract: We present new algorithms for estimating and testing collision probability, a fundamental measure of the spread of a discrete distribution that is widely used in many scientific fields. We describe an algorithm that satisfies $(\alpha, \beta)$-local differential privacy and…

  • Method of Moments Estimation with Python Code

    Let’s say you are in a customer care center, and you would like to know the probability distribution of the number of calls per minute, or in other words, you want to answer the question: what is the probability of receiving zero, one, two, … etc., calls per…

  • Adaptivity and Convergence of Probability Flow ODEs in Diffusion Generative Models

    arXiv:2501.18863v1. Abstract: Score-based generative models, which transform noise into data by learning to reverse a diffusion process, have become a cornerstone of modern generative AI. This paper contributes to establishing theoretical guarantees for the probability flow ODE, a widely used diffusion-based…

  • Basics of Probability Notations

    Union, Intersection, Independence, Disjoint, Complement: Advanced Probability for Data Science Series (1). Sunghyun Ahn, Towards Data Science.

  • LITE: Efficiently Estimating Gaussian Probability of Maximality

    arXiv:2501.13535v1. Abstract: We consider the problem of computing the probability of maximality (PoM) of a Gaussian random vector, i.e., the probability for each dimension to be maximal. This is a key challenge in applications ranging from Bayesian optimization to reinforcement learning, where the PoM not…

  • Model Calibration, Explained: A Visual Guide with Code Examples for Beginners

    Model Evaluation & Optimization: When all models have similar accuracy, now what? You’ve trained several classification models, and they all seem to be performing well with high accuracy scores. Congratulations! But hold on: is one model truly better than the others? Accuracy alone doesn’t tell the…

  • Method of Moments Estimation with Python Code

    How to understand and implement the estimator from scratch. Let’s say you are in a customer care center, and you would like to know the probability distribution of the number of calls per minute, or in other words, you want to answer the question:…

  • Lessons from COVID-19: Why Probability Distributions Matter

    Understanding Distributions with Extremes: Probability for Data Science Series (END). Sunghyun Ahn, Towards Data Science.

  • Probability Distributions: Poisson vs. Binomial Distribution

    Using Soccer to Understand the Difference Between Poisson & Binomial: Probability for Data Science Series (3). Sunghyun Ahn, Towards Data Science.