Tag: explanations

Aligned explanations in neural networks

arXiv:2601.04378v1 (announce type: cross). Abstract: Feature attribution is the dominant paradigm for explaining deep neural networks. However, most existing methods only loosely reflect the model’s prediction-making process, thereby merely white-painting the black box. We argue that explanatory alignment is a key aspect of trustworthiness in prediction tasks: explanations must be…

January 9, 2026

Interpretable Model-Aware Counterfactual Explanations for Random Forest

arXiv:2510.27397v1 (announce type: new). Abstract: Despite their enormous predictive power, machine learning models are often unsuitable for applications in regulated industries such as finance, due to their limited capacity to provide explanations. While model-agnostic frameworks such as Shapley values have proved to be convenient and popular, they rarely…

November 3, 2025

Performative Validity of Recourse Explanations

arXiv:2506.15366v1 (announce type: new). Abstract: When applicants get rejected by an algorithmic decision system, recourse explanations provide actionable suggestions for how to change their input features to get a positive evaluation. A crucial yet overlooked phenomenon is that recourse explanations are performative: When many applicants act according to their recommendations,…

June 19, 2025