Regularizing axis-aligned ensembles via data rotations that favor simpler learners

Blaser, Rico and Fryzlewicz, Piotr ORCID: 0000-0002-9676-902X (2021) Regularizing axis-aligned ensembles via data rotations that favor simpler learners. Statistics and Computing, 31 (2). ISSN 0960-3174

Text (Blaser_regularizing-axis-aligned-ensembles--published) - Published Version
Available under License Creative Commons Attribution.

Identification Number: 10.1007/s11222-020-09973-3

Abstract

To overcome the inherent limitations of axis-aligned base learners in ensemble learning, several methods of rotating the feature space have been discussed in the literature. In particular, smoother decision boundaries can often be obtained from axis-aligned ensembles by rotating the feature space. In the present paper, we introduce a low-cost regularization technique that favors rotations which produce compact base learners. The restated problem adds a shrinkage term to the loss function that explicitly accounts for the complexity of the base learners. For example, for tree-based ensembles, we apply a penalty based on the median number of nodes and the median depth of the trees in the forest. Rather than jointly minimizing prediction error and model complexity, which is computationally infeasible, we first generate a prioritized weighting of the available feature rotations that promotes lower model complexity and subsequently minimize prediction errors on each of the selected rotations. We show that the resulting ensembles tend to be significantly more dense, faster to evaluate, and competitive at generalizing in out-of-sample predictions.
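The two-stage idea in the abstract (first rank candidate rotations by how compact the resulting base learners are, then fit only on the preferred rotations) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy 2-D grid with a diagonal class boundary, a fixed list of candidate angles, one greedily grown tree per rotation (the paper uses the median node count and median depth across a forest), and a hypothetical complexity score equal to nodes plus depth. The names `rotate`, `grow_tree`, `complexity`, and `selected` are illustrative only.

```python
import math

def rotate(points, theta):
    """Rotate 2-D points counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def gini(labels):
    """Gini impurity of a non-empty list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def grow_tree(points, labels, depth=0):
    """Greedily grow an axis-aligned tree to purity.

    Returns (number of nodes, maximum depth) of the grown tree.
    """
    if len(set(labels)) <= 1:
        return 1, depth  # pure leaf
    best = None  # (weighted impurity, feature index, threshold)
    for j in range(2):
        values = sorted(set(p[j] for p in points))
        # Use each distinct value except the largest as a threshold,
        # so both children are always non-empty.
        for t in values[:-1]:
            left = [l for p, l in zip(points, labels) if p[j] <= t]
            right = [l for p, l in zip(points, labels) if p[j] > t]
            score = len(left) * gini(left) + len(right) * gini(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:
        return 1, depth  # identical points with mixed labels: stop
    _, j, t = best
    left = [(p, l) for p, l in zip(points, labels) if p[j] <= t]
    right = [(p, l) for p, l in zip(points, labels) if p[j] > t]
    ln, ld = grow_tree([p for p, _ in left], [l for _, l in left], depth + 1)
    rn, rd = grow_tree([p for p, _ in right], [l for _, l in right], depth + 1)
    return 1 + ln + rn, max(ld, rd)

# Toy data (assumption): a 7x7 integer grid with a diagonal boundary,
# which is awkward for axis-aligned splits in the original coordinates.
grid = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
labels = [1 if x + y >= 1 else 0 for x, y in grid]

# Stage 1: score every candidate rotation by the complexity of its tree.
angles_deg = [0, 15, 30, 45, 60, 75]
complexity = {}
for a in angles_deg:
    complexity[a] = grow_tree(rotate(grid, math.radians(a)), labels)

# Hypothetical penalty: node count plus depth (lower is preferred).
ranked = sorted(angles_deg, key=lambda a: sum(complexity[a]))

# Stage 2: keep only the most compact rotations for the final ensemble.
selected = ranked[:2]

for a in ranked:
    n_nodes, depth = complexity[a]
    print(f"angle={a:2d}  nodes={n_nodes}  depth={depth}")
```

On this toy problem the 45-degree rotation aligns the class boundary with a coordinate axis, so a single split suffices and that rotation ranks first. In a realistic setting the candidate rotations would be random rotation matrices in higher dimensions, and the base learners fit on the selected rotations would be tuned for prediction error rather than grown to purity.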

Item Type:	Article
Official URL:	https://www.springer.com/journal/11222
Additional Information:	© 2021 The Authors
Divisions:	Statistics
Subjects:	H Social Sciences > HA Statistics
Date Deposited:	17 Dec 2020 11:45
Last Modified:	07 Jun 2025 23:23
URI:	http://eprints.lse.ac.uk/id/eprint/107935
