Tag: selection
-
Beyond Cross-Validation: Adaptive Parameter Selection for Kernel-Based Gradient Descents
arXiv:2603.03401v1 Announce Type: new Abstract: This paper proposes a novel parameter selection strategy for kernel-based gradient descent (KGD) algorithms, integrating bias-variance analysis with the splitting method. We introduce the concept of empirical effective dimension to quantify iteration increments in KGD, deriving an adaptive parameter selection strategy…
-
Unsupervised Feature Selection via Robust Autoencoder and Adaptive Graph Learning
arXiv:2512.18720v1 Announce Type: new Abstract: Effective feature selection is essential for high-dimensional data analysis and machine learning. Unsupervised feature selection (UFS) aims to simultaneously cluster data and identify the most discriminative features. Most existing UFS methods linearly project features into a pseudo-label space for clustering,…
-
When Features Beat Noise: A Feature Selection Technique Through Noise-Based Hypothesis Testing
arXiv:2511.20851v1 Announce Type: new Abstract: Feature selection has remained a daunting challenge in machine learning and artificial intelligence, where increasingly complex, high-dimensional datasets demand principled strategies for isolating the most informative predictors. Despite widespread adoption, many established techniques suffer from notable limitations; some…
-
FAST: Topology-Aware Frequency-Domain Distribution Matching for Coreset Selection
arXiv:2511.19476v1 Announce Type: new Abstract: Coreset selection compresses large datasets into compact, representative subsets, reducing the energy and computational burden of training deep neural networks. Existing methods are either: (i) DNN-based, which are tied to model-specific parameters and introduce architectural bias; or (ii) DNN-free, which rely on…
-
Differentially Private High-dimensional Variable Selection via Integer Programming
arXiv:2510.22062v1 Announce Type: new Abstract: Sparse variable selection improves interpretability and generalization in high-dimensional learning by selecting a small subset of informative features. Recent advances in Mixed Integer Programming (MIP) have enabled solving large-scale non-private sparse regression – known as Best Subset Selection (BSS) – with millions…
-
Variable Selection Using Relative Importance Rankings
arXiv:2509.10853v1 Announce Type: new Abstract: Although conceptually related, variable selection and relative importance (RI) analysis have been treated quite differently in the literature. While RI is typically used for post-hoc model explanation, this paper explores its potential for variable ranking and filter-based selection before model creation. Specifically, we anticipate…
-
Online Conformal Selection with Accept-to-Reject Changes
arXiv:2508.13838v1 Announce Type: new Abstract: Selecting a subset of promising candidates from a large pool is crucial across various scientific and real-world applications. Conformal selection offers a distribution-free and model-agnostic framework for candidate selection with uncertainty quantification. While effective in offline settings, its application to online scenarios, where data…
-
Subgrid BoostCNN: Efficient Boosting of Convolutional Networks via Gradient-Guided Feature Selection
arXiv:2507.22842v1 Announce Type: new Abstract: Convolutional Neural Networks (CNNs) have achieved remarkable success across a wide range of machine learning tasks by leveraging hierarchical feature learning through deep architectures. However, the large number of layers and millions of parameters often make CNNs computationally expensive…
-
GOLFS: Feature Selection via Combining Both Global and Local Information for High Dimensional Clustering
arXiv:2507.10956v1 Announce Type: new Abstract: It is important to identify the discriminative features for high dimensional clustering. However, due to the lack of cluster labels, the regularization methods developed for supervised feature selection cannot be directly applied. To learn the…
-
Online Conformal Model Selection for Nonstationary Time Series
arXiv:2506.05544v1 Announce Type: new Abstract: This paper introduces the MPS (Model Prediction Set), a novel framework for online model selection for nonstationary time series. Classical model selection methods, such as information criteria and cross-validation, rely heavily on the stationarity assumption and often fail in dynamic environments which…
-
Coreset selection for the Sinkhorn divergence and generic smooth divergences
arXiv:2504.20194v1 Announce Type: new Abstract: We introduce CO2, an efficient algorithm to produce convexly-weighted coresets with respect to generic smooth divergences. By employing a functional Taylor expansion, we show a local equivalence between sufficiently regular losses and their second order approximations, reducing the coreset selection…
-
Explained: How Does L1 Regularization Perform Feature Selection?
Feature selection is the process of selecting an optimal subset of features from a given set of features; an optimal feature subset is one that maximizes the performance of the model on the given task. Feature selection can be a manual or rather explicit process when…
-
Can SGD Select Good Fishermen? Local Convergence under Self-Selection Biases and Beyond
arXiv:2504.07133v1 Announce Type: new Abstract: We revisit the problem of estimating $k$ linear regressors with self-selection bias in $d$ dimensions with the maximum selection criterion, as introduced by Cherapanamjeri, Daskalakis, Ilyas, and Zampetakis [CDIZ23, STOC’23]. Our main result is a $\operatorname{poly}(d,k,1/\varepsilon) + k^{O(k)}$…
-
Dynamic Assortment Selection and Pricing with Censored Preference Feedback
arXiv:2504.02324v1 Announce Type: new Abstract: In this study, we investigate the problem of dynamic multi-product selection and pricing by introducing a novel framework based on a censored multinomial logit (C-MNL) choice model. In this model, sellers present a set of products with prices, and buyers filter…
-
Regression-Based Estimation of Causal Effects in the Presence of Selection Bias and Confounding
arXiv:2503.20546v1 Announce Type: new Abstract: We consider the problem of estimating the expected causal effect $E[Y|do(X)]$ for a target variable $Y$ when treatment $X$ is set by intervention, focusing on continuous random variables. In settings without selection bias or confounding, $E[Y|do(X)] =…
-
Online Selective Conformal Prediction: Errors and Solutions
arXiv:2503.16809v1 Announce Type: new Abstract: In online selective conformal inference, data arrives sequentially, and prediction intervals are constructed only when an online selection rule is met. Since online selections may break the exchangeability between the selected test datum and the rest of the data, one must correct for…
-
Distributionally Robust Coreset Selection under Covariate Shift
arXiv:2501.14253v1 Announce Type: new Abstract: Coreset selection, which involves selecting a small subset from an existing training dataset, is an approach to reducing training data, and various approaches have been proposed for this method. In practical situations where these methods are employed, it is often the case that…
-
Statistical Inference for Sequential Feature Selection after Domain Adaptation
arXiv:2501.09933v1 Announce Type: new Abstract: In high-dimensional regression, feature selection methods, such as sequential feature selection (SeqFS), are commonly used to identify relevant features. When data is limited, domain adaptation (DA) becomes crucial for transferring knowledge from a related source domain to a target domain, improving…
-
On the use of Statistical Learning Theory for model selection in Structural Health Monitoring
arXiv:2501.08050v1 Announce Type: new Abstract: Whenever data-based systems are employed in engineering applications, defining an optimal statistical representation is subject to the problem of model selection. This paper focusses on how well models can generalise in Structural Health Monitoring (SHM). Although…
-
Variable Selection Methods for Multivariate, Functional, and Complex Biomedical Data in the AI Age
arXiv:2501.06868v1 Announce Type: new Abstract: Many problems within personalized medicine and digital health rely on the analysis of continuous-time functional biomarkers and other complex data structures emerging from high-resolution patient monitoring. In this context, this work proposes new optimization-based variable selection…
-
Fréchet regression for multi-label feature selection with implicit regularization
arXiv:2412.18247v1 Announce Type: new Abstract: Fréchet regression extends linear regression to model complex responses in metric spaces, making it particularly relevant for multi-label regression, where each instance can have multiple associated labels. However, variable selection within this framework remains underexplored. In this paper, we propose…