Tag: calibration

Multiclass Calibration Assessment and Recalibration of Probability Predictions via the Linear Log Odds Calibration Function

arXiv:2602.18573v1. Abstract: Machine-generated probability predictions are essential in modern classification tasks such as image classification. A model is well calibrated when its predicted probabilities correspond to observed event frequencies. Despite the need for multicategory recalibration methods, existing…

February 24, 2026
Nonparametric Distribution Regression Re-calibration

arXiv:2602.13362v1. Abstract: A key challenge in probabilistic regression is ensuring that predictive distributions accurately reflect true empirical uncertainty. Minimizing overall prediction error often encourages models to prioritize informativeness over calibration, producing narrow but overconfident predictions. However, in safety-critical settings, trustworthy uncertainty estimates are often more valuable than narrow…

February 17, 2026
Design-marginal calibration of Gaussian process predictive distributions: Bayesian and conformal approaches

arXiv:2512.05611v1. Abstract: We study the calibration of Gaussian process (GP) predictive distributions in the interpolation setting from a design-marginal perspective. Conditioning on the data and averaging over a design measure mu, we formalize mu-coverage for central intervals and mu-probabilistic calibration through…

December 8, 2025
Geometric Calibration and Neutral Zones for Uncertainty-Aware Multi-Class Classification

arXiv:2511.20960v1. Abstract: Modern artificial intelligence systems make critical decisions yet often fail silently when uncertain. We develop a geometric framework for post-hoc calibration of neural network probability outputs, treating probability vectors as points on the $(c-1)$-dimensional probability simplex equipped with the Fisher–Rao metric.…

November 27, 2025
Enforcing Calibration in Multi-Output Probabilistic Regression with Pre-rank Regularization

arXiv:2510.21273v1. Abstract: Probabilistic models must be well calibrated to support reliable decision-making. While calibration in single-output regression is well studied, defining and achieving multivariate calibration in multi-output regression remains considerably more challenging. The existing literature on multivariate calibration primarily focuses on diagnostic tools…

October 27, 2025
Calibrating Generative Models

arXiv:2510.10020v1. Abstract: Generative models frequently suffer miscalibration, wherein class probabilities and other statistics of the sampling distribution deviate from desired values. We frame calibration as a constrained optimization problem and seek the closest model in Kullback-Leibler divergence satisfying calibration constraints. To address the intractability of imposing these constraints exactly,…

October 14, 2025
CP4SBI: Local Conformal Calibration of Credible Sets in Simulation-Based Inference

arXiv:2508.17077v1. Abstract: Current experimental scientists have been increasingly relying on simulation-based inference (SBI) to invert complex non-linear models with intractable likelihoods. However, posterior approximations obtained with SBI are often miscalibrated, causing credible regions to undercover true parameters. We develop $\texttt{CP4SBI}$, a model-agnostic…

August 26, 2025
Accuracy Is Dead: Calibration, Discrimination, and Other Metrics You Actually Need

A deep dive into advanced evaluation for data scientists. Pol Marin, Towards Data Science.

July 15, 2025
Know What You Don’t Know: Uncertainty Calibration of Process Reward Models

arXiv:2506.09338v1. Abstract: Process reward models (PRMs) play a central role in guiding inference-time scaling algorithms for large language models (LLMs). However, we observe that even state-of-the-art PRMs can be poorly calibrated and often overestimate success probabilities. To address this, we present…

June 12, 2025
Boosting In-Context Learning in LLMs Through the Lens of Classical Supervised Learning

arXiv:2505.23783v1. Abstract: In-Context Learning (ICL) allows Large Language Models (LLMs) to adapt to new tasks with just a few examples, but their predictions often suffer from systematic biases, leading to unstable performances in classification. While calibration techniques are proposed to…

June 2, 2025
Evaluating Uncertainty in Deep Gaussian Processes

arXiv:2504.17719v1. Abstract: Reliable uncertainty estimates are crucial in modern machine learning. Deep Gaussian Processes (DGPs) and Deep Sigma Point Processes (DSPPs) extend GPs hierarchically, offering promising methods for uncertainty quantification grounded in Bayesian principles. However, their empirical calibration and robustness under distribution shift relative to baselines…

April 25, 2025
Advancing calibration for stochastic agent-based models in epidemiology with Stein variational inference and Gaussian process surrogates

arXiv:2502.19550v1. Abstract: Accurate calibration of stochastic agent-based models (ABMs) in epidemiology is crucial to make them useful in public health policy decisions and interventions. Traditional calibration methods, e.g., Markov Chain Monte Carlo (MCMC), that yield a…

February 28, 2025
Understanding Model Calibration: A Gentle Introduction & Visual Exploration

How Reliable Are Your Predictions? To be considered reliable, a model must be calibrated so that its confidence in each decision closely reflects its true outcome. In this blog post we’ll take a look at the most commonly used definition for calibration and then dive…

February 12, 2025
Generalized Venn and Venn-Abers Calibration with Applications in Conformal Prediction

arXiv:2502.05676v1. Abstract: Ensuring model calibration is critical for reliable predictions, yet popular distribution-free methods, such as histogram binning and isotonic regression, provide only asymptotic guarantees. We introduce a unified framework for Venn and Venn-Abers calibration, generalizing Vovk’s binary classification approach to arbitrary…

February 11, 2025
Model Calibration, Explained: A Visual Guide with Code Examples for Beginners

When all models have similar accuracy, now what? You’ve trained several classification models, and they all seem to be performing well with high accuracy scores. Congratulations! But hold on: is one model truly better than the others? Accuracy alone doesn’t tell the…

January 11, 2025