Tag: variable

  • 5 Ways to Implement Variable Discretization

    An overview of powerful methods for transforming continuous variables into discrete ones. First published on Towards Data Science. Author: Rukshan Pramoditha.

  • Selecting Optimal Variable Order in Autoregressive Ising Models

    arXiv:2602.20394v1. Abstract: Autoregressive models enable tractable sampling from learned probability distributions, but their performance critically depends on the variable ordering used in the factorization via complexities of the resulting conditional distributions. We propose to learn the Markov random field describing the underlying data, and…

  • One Permutation Is All You Need: Fast, Reliable Variable Importance and Model Stress-Testing

    arXiv:2512.13892v1. Abstract: Reliable estimation of feature contributions in machine learning models is essential for trust, transparency and regulatory compliance, especially when models are proprietary or otherwise operate as black boxes. While permutation-based methods are a standard tool for this…

  • Differentially Private High-dimensional Variable Selection via Integer Programming

    arXiv:2510.22062v1. Abstract: Sparse variable selection improves interpretability and generalization in high-dimensional learning by selecting a small subset of informative features. Recent advances in Mixed Integer Programming (MIP) have enabled solving large-scale non-private sparse regression – known as Best Subset Selection (BSS) – with millions…

  • Variable Selection Using Relative Importance Rankings

    arXiv:2509.10853v1. Abstract: Although conceptually related, variable selection and relative importance (RI) analysis have been treated quite differently in the literature. While RI is typically used for post-hoc model explanation, this paper explores its potential for variable ranking and filter-based selection before model creation. Specifically, we anticipate…

  • Hierarchical Variable Importance with Statistical Control for Medical Data-Based Prediction

    arXiv:2508.08724v1. Abstract: Recent advances in machine learning have greatly expanded the repertoire of predictive methods for medical imaging. However, the interpretability of complex models remains a challenge, which limits their utility in medical applications. Recently, model-agnostic methods have been proposed to measure…

  • When Predictors Collide: Mastering VIF in Multicollinear Regression

    In regression models, the independent variables should be at most weakly dependent on one another, i.e. largely uncorrelated. If such a dependency does exist, it is referred to as multicollinearity, and it leads to unstable models and results that are difficult to interpret. The…

  • Causal Bayesian Optimization with Unknown Graphs

    arXiv:2503.19554v1. Abstract: Causal Bayesian Optimization (CBO) is a methodology designed to optimize an outcome variable by leveraging known causal relationships through targeted interventions. Traditional CBO methods require a fully and accurately specified causal graph, which is a limitation in many real-world scenarios where such graphs are…

  • Identifying metric structures of deep latent variable models

    arXiv:2502.13757v1. Abstract: Deep latent variable models learn condensed representations of data that, hopefully, reflect the inner workings of the studied phenomena. Unfortunately, these latent representations are not statistically identifiable, meaning they cannot be uniquely determined. Domain experts, therefore, need to tread carefully when interpreting…

  • Knoop: Practical Enhancement of Knockoff with Over-Parameterization for Variable Selection

    arXiv:2501.17889v1. Abstract: Variable selection plays a crucial role in enhancing modeling effectiveness across diverse fields, addressing the challenges posed by high-dimensional datasets of correlated variables. This work introduces a novel approach, namely Knockoff with over-parameterization (Knoop), to enhance Knockoff filters for variable…

  • Variable Selection Methods for Multivariate, Functional, and Complex Biomedical Data in the AI Age

    arXiv:2501.06868v1. Abstract: Many problems within personalized medicine and digital health rely on the analysis of continuous-time functional biomarkers and other complex data structures emerging from high-resolution patient monitoring. In this context, this work proposes new optimization-based variable selection…

  • Effortless Data Handling: Find Variables Across Multiple Data Files with R

    A practical solution with code and workflow. Lost in a maze of datasets and endless data dictionaries? Say goodbye to tedious variable hunting! Discover how to quickly identify and extract the variables you need from multiple SAS files using two simple R functions. Streamline your…