
Ranking-based variable selection for high-dimensional data

Baranowski, Rafal, Chen, Yining and Fryzlewicz, Piotr (2018) Ranking-based variable selection for high-dimensional data. Statistica Sinica. ISSN 1017-0405 (In Press)


We propose Ranking-Based Variable Selection (RBVS), a technique for identifying the important variables that influence the response in high-dimensional data. The RBVS algorithm uses subsampling to identify the set of covariates that non-spuriously appears at the top of a chosen variable ranking. We study the conditions under which such a set is unique and show that our procedure can recover it from the data. Unlike many existing high-dimensional variable selection techniques, RBVS distinguishes, among all the relevant variables, between the important and the unimportant ones, and aims to recover only the important ones. Moreover, RBVS does not require any model restrictions on the relationship between the response and the covariates; it is therefore widely applicable in both parametric and non-parametric contexts. We illustrate its good practical performance in a comparative simulation study. The RBVS algorithm is implemented in the publicly available R package rbvs.
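
To make the subsampling idea in the abstract concrete, the following is a minimal Python sketch. It is not the published RBVS procedure and does not use the rbvs package. It assumes absolute Pearson correlation as the variable ranking, and the function name top_set_frequencies, its parameters, and the simulated data are all hypothetical, chosen only for illustration. For each candidate size k, it reports which set of k covariates most often occupies the top of the ranking across random subsamples, and how often it does so.

```python
from collections import Counter

import numpy as np


def top_set_frequencies(X, y, k_max, n_subsamples=100, subsample_size=None, seed=0):
    """For k = 1..k_max, return the most frequent top-k covariate set and the
    fraction of subsamples in which it appears (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    m = subsample_size or n // 2  # assumption: subsample half of the rows by default
    counts = {k: Counter() for k in range(1, k_max + 1)}

    for _ in range(n_subsamples):
        idx = rng.choice(n, size=m, replace=False)  # draw rows without replacement
        Xs = X[idx] - X[idx].mean(axis=0)
        ys = y[idx] - y[idx].mean()
        # Assumed ranking measure: absolute sample correlation with the response.
        score = np.abs(Xs.T @ ys) / (np.linalg.norm(Xs, axis=0) * np.linalg.norm(ys))
        order = np.argsort(-score)  # covariate indices, most strongly correlated first
        for k in range(1, k_max + 1):
            counts[k][frozenset(order[:k].tolist())] += 1

    result = {}
    for k, counter in counts.items():
        best_set, hits = counter.most_common(1)[0]
        result[k] = (best_set, hits / n_subsamples)
    return result


# Simulated example: only covariates 0 and 1 influence the response.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20))
y = 3 * X[:, 0] + 3 * X[:, 1] + rng.standard_normal(200)

freqs = top_set_frequencies(X, y, k_max=4)
for k, (best_set, freq) in freqs.items():
    print(k, sorted(best_set), freq)
```

In this simulated setting, the top-2 set would be expected to be stable across subsamples, while larger sets have to include noise covariates whose ranks vary from one subsample to the next. A drop in frequency of this kind is one informal way to read "non-spuriously appears at the top"; the paper itself should be consulted for the actual selection rule.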

Item Type: Article
Additional Information: © 2018 Institute of Statistical Science, Academia Sinica
Divisions: Statistics
Subjects: H Social Sciences > HA Statistics
Sets: Departments > Statistics
Date Deposited: 18 Sep 2018 10:02
Last Modified: 11 Mar 2019 00:09
