
Stepwise searching for feature variables in high-dimensional linear regression

An, Hongzhi, Huang, Da, Yao, Qiwei and Zhang, Cun-Hui (2008) Stepwise searching for feature variables in high-dimensional linear regression. The London School of Economics and Political Science, London, UK.


Abstract

We investigate the classical stepwise forward and backward search methods for selecting sparse models in linear regression when the number of candidate variables p exceeds the number of observations n. In the noiseless case, we give definite upper bounds on the number of forward search steps needed to recover all relevant variables, provided each step of the forward search is approximately optimal, up to a fraction, in reducing the residual sum of squares. Under mild conditions, these upper bounds are of the same order as the size of a true sparse model. In the presence of noise, traditional information criteria such as BIC and AIC are designed for p < n and may fail spectacularly when p is greater than n. To overcome this difficulty, two information criteria, BICP and BICC, are proposed to serve as stopping rules in the stepwise searches. The forward search with noise is proved to be approximately optimal with high probability, compared with the optimal forward search without noise, so that the upper bounds on the number of steps still apply. The proposed BICP is proved to stop the forward search as soon as it recovers all relevant variables and to remove all extra variables in the backward deletion. This leads to the selection consistency of the estimated models. The proposed methods are illustrated in a simulation study, which indicates that the new methods outperform a counterpart LASSO selector with a penalty parameter set at a fixed value.
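For orientation, the noiseless forward search described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name `forward_search`, the tolerance `tol`, the simulated Gaussian design, and the chosen coefficients are all hypothetical, and the stopping rule is a plain residual-sum-of-squares threshold rather than the BICP or BICC criteria, whose formulas are not given on this page.

```python
import numpy as np

def forward_search(X, y, max_steps, tol=1e-10):
    """Greedy forward search: each step adds the column of X that gives
    the smallest residual sum of squares (RSS) after a least-squares refit.
    Stops when the RSS falls below tol (an assumed noiseless stopping rule)."""
    p = X.shape[1]
    selected = []
    rss = float(y @ y)  # RSS of the empty model
    for _ in range(max_steps):
        if rss <= tol:
            break
        best_j, best_rss = None, np.inf
        for j in range(p):
            if j in selected:
                continue
            cols = selected + [j]
            coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            r = y - X[:, cols] @ coef
            cand_rss = float(r @ r)
            if cand_rss < best_rss:
                best_j, best_rss = j, cand_rss
        if best_j is None:  # no candidate columns left
            break
        selected.append(best_j)
        rss = best_rss
    return selected, rss

# Hypothetical noiseless example with p > n and three relevant variables.
rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.standard_normal((n, p))
true_support = [3, 17, 120]
y = X[:, true_support] @ np.array([3.0, -2.0, 1.5])

selected, final_rss = forward_search(X, y, max_steps=10)
print("selected columns:", selected)
print("final RSS:", final_rss)
```

In the noisy setting the abstract replaces the RSS threshold with an information criterion as the stopping rule and follows the forward pass with a backward deletion pass; neither is reproduced here.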

Item Type: Monograph (Working Paper)
Official URL: http://www.lse.ac.uk/statistics/home.aspx
Additional Information: © 2008 The Authors
Library of Congress subject classification: Q Science > QA Mathematics
Sets: Departments > Statistics
Rights: http://www.lse.ac.uk/library/usingTheLibrary/academicSupport/OA/depositYourResearch.aspx
Date Deposited: 02 Aug 2013 10:35
URL: http://eprints.lse.ac.uk/51349/
