Baranowski, Rafal, Chen, Yining ORCID: 0000-0003-1697-1920 and Fryzlewicz, Piotr ORCID: 0000-0002-9676-902X (2020) Ranking-based variable selection for high-dimensional data. Statistica Sinica, 30 (3). pp. 1485-1516. ISSN 1017-0405
Abstract
We propose a ranking-based variable selection (RBVS) technique that identifies important variables influencing the response in high-dimensional data. RBVS uses subsampling to identify the covariates that appear nonspuriously at the top of a chosen variable ranking. We study the conditions under which such a set is unique, and show that it can be recovered successfully from the data by our procedure. Unlike many existing high-dimensional variable selection techniques, among all relevant variables, RBVS distinguishes between important and unimportant variables, and aims to recover only the important ones. Moreover, RBVS does not require model restrictions on the relationship between the response and the covariates, and, thus, is widely applicable in both parametric and nonparametric contexts. Lastly, we illustrate the good practical performance of the proposed technique by means of a comparative simulation study. The RBVS algorithm is implemented in rbvs, a publicly available R package.
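The subsampling idea in the abstract can be illustrated with a small Python sketch. This is not the authors' algorithm and not the interface of the rbvs R package. It is a simplified illustration under stated assumptions: the ranking is by absolute sample correlation with the response, the number of top variables k is fixed in advance rather than chosen from the data, and the function name `top_k_set_frequencies`, its parameters, and the toy data are all hypothetical.

```python
import numpy as np
from collections import Counter


def top_k_set_frequencies(X, y, k, n_subsamples=200, subsample_size=None, seed=0):
    """Estimate how often each k-subset of covariates tops the ranking.

    Hypothetical helper for illustration only: covariates are ranked by
    absolute sample correlation with y on random subsamples of the rows.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    m = subsample_size or n // 2  # assumed default: half of the observations
    counts = Counter()
    for _ in range(n_subsamples):
        # Draw a subsample of rows without replacement.
        idx = rng.choice(n, size=m, replace=False)
        Xs = X[idx] - X[idx].mean(axis=0)
        ys = y[idx] - y[idx].mean()
        # Absolute sample correlation of each covariate with the response.
        denom = np.linalg.norm(Xs, axis=0) * np.linalg.norm(ys)
        score = np.abs(Xs.T @ ys) / denom
        # Record the unordered set of the k top-ranked covariates.
        top = frozenset(np.argsort(-score)[:k].tolist())
        counts[top] += 1
    return {s: c / n_subsamples for s, c in counts.items()}


# Toy data (assumed): only covariates 0 and 1 influence the response.
rng = np.random.default_rng(1)
n, p = 200, 50
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] - 3.0 * X[:, 1] + rng.standard_normal(n)

freqs = top_k_set_frequencies(X, y, k=2)
best_set, best_freq = max(freqs.items(), key=lambda kv: kv[1])
print(sorted(best_set), best_freq)
```

A set that tops the ranking in almost every subsample is a candidate for the non-spurious top set, whereas a set that appears only occasionally is more likely an artifact of a particular subsample. Choosing k from the data, which the paper addresses, is left out of this sketch.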