Gradient Learning With the Mode-Induced Loss: Consistency Analysis and Applications


Abstract:

Variable selection methods aim to select the key covariates related to the response variable in learning problems with high-dimensional data. Typical variable selection methods are formulated as sparse mean regression over a parametric hypothesis class, such as linear or additive functions. Despite rapid progress, the existing methods depend heavily on the chosen parametric function class and cannot handle variable selection for problems where the data noise is heavy-tailed or skewed. To circumvent these drawbacks, we propose sparse gradient learning with the mode-induced loss (SGLML) for robust model-free (MF) variable selection. We establish a theoretical analysis of SGLML, including an upper bound on the excess risk and the consistency of variable selection, which guarantees its ability, under mild conditions, to estimate gradients (through the lens of gradient risk) and to identify informative variables. Experiments on simulated and real data demonstrate the competitive performance of our method over previous gradient learning (GL) methods.
Page(s): 9686 - 9699
Date of Publication: 18 January 2023

PubMed ID: 37021851

I. Introduction

Due to the demands of computational feasibility and result interpretability, variable selection for high-dimensional data has attracted increasing attention in the statistics and machine learning communities [1], [2], [3], [4]. There is a wide spectrum of variable selection methods, which can be divided mainly into linear models, nonlinear additive models, and partial linear models (PLMs). Under the linear model assumption, active variables are selected either directly by an information criterion on the covariates (e.g., the Bayesian information criterion (BIC) [5] and the Akaike information criterion (AIC) [6]) or by Tikhonov regularization schemes with a sparse penalty on the regression coefficients (e.g., the least absolute shrinkage and selection operator (LASSO) [1], smoothly clipped absolute deviation (SCAD) [7], and least angle regression (LARS) [8]). As a natural extension of linear models, additive models have been proposed for nonlinear approximation and variable selection [9], [10], [11]; popular algorithms include the component selection and smoothing operator (COSSO) [12], nonparametric independence screening (NIS) [13], sparse additive models (SpAM) [14], GroupSpAM [15], and the sparse modal additive model (SpMAM) [16], [17]. As a tradeoff between linear and nonlinear models, PLMs assume that some covariates are linearly related to the response while the others enter nonlinearly [18]. Several efforts have been made toward PLM-based variable selection and function estimation, such as the linear and nonlinear discoverer (LAND) [19], the model pursuit approach [20], and sparse PLMs [21].
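As a concrete illustration of the sparse-penalty approach described above, the following sketch solves the LASSO problem with a simple proximal gradient (ISTA) loop. This is a minimal sketch, not the paper's method: the function name, default parameters, and data shapes are illustrative assumptions.

```python
import numpy as np

def lasso_ista(X, y, lam=0.1, n_iter=1000):
    """Sketch of LASSO variable selection via ISTA (proximal gradient).

    Minimizes (1/(2n)) * ||X w - y||^2 + lam * ||w||_1.
    Names and defaults here are illustrative, not from the paper.
    """
    n, p = X.shape
    # Step size = 1 / Lipschitz constant of the smooth term's gradient,
    # where the Lipschitz constant is ||X||_2^2 / n (spectral norm squared).
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n   # gradient of the least-squares term
        z = w - step * grad            # plain gradient step
        # Soft-thresholding: the proximal operator of the l1 penalty.
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return w
```

Coefficients driven exactly to zero by the soft-thresholding step correspond to variables screened out of the model, which is how the sparse penalty performs selection rather than mere shrinkage.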
