
Ranking-based variable selection for high-dimensional data

Baranowski, Rafal and Chen, Yining and Fryzlewicz, Piotr (2018) Ranking-based variable selection for high-dimensional data. Statistica Sinica. ISSN 1017-0405 (In Press)

Text - Accepted Version (restricted to Repository staff only)

Abstract

We propose Ranking-Based Variable Selection (RBVS), a technique for identifying the important variables that influence the response in high-dimensional data. The RBVS algorithm uses subsampling to identify the set of covariates that non-spuriously appears at the top of a chosen variable ranking. We study the conditions under which such a set is unique and show that our procedure can successfully recover it from the data. Unlike many existing high-dimensional variable selection techniques, RBVS distinguishes, among all the relevant variables, between the important and the unimportant ones, and aims to recover only the important ones. Moreover, RBVS does not require any model restrictions on the relationship between the response and the covariates; it is therefore widely applicable in both parametric and non-parametric contexts. We illustrate its good practical performance in a comparative simulation study. The RBVS algorithm is implemented in the publicly available R package rbvs.
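
The subsampling idea described in the abstract can be illustrated with a minimal Python sketch. This is not the RBVS procedure from the paper, nor the interface of the rbvs R package; it only shows, under simplifying assumptions, how repeated subsampling can reveal which covariates consistently occupy the top of a ranking. The function name `top_k_frequency`, the choice of absolute Pearson correlation as the ranking measure, the half-sample subsample size, and the caller-supplied set size `k` are all assumptions made for this illustration.

```python
import numpy as np
from collections import Counter


def top_k_frequency(X, y, k, n_subsamples=50, subsample_size=None, seed=0):
    """Illustrative only (hypothetical helper, not the RBVS algorithm).

    On each random subsample, rank covariates by absolute Pearson
    correlation with y and record the unordered set of the k top-ranked
    covariates. Return the most frequent top-k set (as a sorted list of
    column indices) and the fraction of subsamples in which it appeared.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = subsample_size or n // 2  # assumption: half-sample by default
    counts = Counter()
    for _ in range(n_subsamples):
        # Draw a subsample of rows without replacement.
        idx = rng.choice(n, size=m, replace=False)
        Xc = X[idx] - X[idx].mean(axis=0)
        yc = y[idx] - y[idx].mean()
        # Absolute sample correlation of every column with the response.
        corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
        # Indices of the k largest correlations, stored as an unordered set.
        top = frozenset(np.argsort(-corr)[:k].tolist())
        counts[top] += 1
    best_set, count = counts.most_common(1)[0]
    return sorted(best_set), count / n_subsamples


# Synthetic example: 500 covariates, only columns 0 and 1 affect the response.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 500))
y = 3.0 * X[:, 0] - 3.0 * X[:, 1] + rng.standard_normal(200)

selected, freq = top_k_frequency(X, y, k=2)
print(selected, freq)
```

In this sketch a set that tops the ranking in nearly every subsample is treated as non-spurious, whereas a set that appears only occasionally is more plausibly an artefact of a particular draw. Any other ranking measure could be substituted for the correlation used here.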

Item Type: Article
Official URL: http://www3.stat.sinica.edu.tw/statistica/
Additional Information: © 2018 Institute of Statistical Science, Academia Sinica
Subjects: H Social Sciences > HA Statistics
Sets: Departments > Statistics
Date Deposited: 18 Sep 2018 10:02
Last Modified: 20 Sep 2018 10:18
URI: http://eprints.lse.ac.uk/id/eprint/90233
