Cluster detection and clustering with random start forward searches

Atkinson, Anthony C. and Riani, Marco and Cerioli, Andrea (2017) Cluster detection and clustering with random start forward searches. Journal of Applied Statistics. ISSN 0266-4763 (In Press)

PDF - Accepted Version
Restricted to Repository staff only

Abstract

The forward search is a method of robust data analysis in which outlier-free subsets of the data of increasing size are used in model fitting; the data are then ordered by closeness to the model. Here the forward search, with many random starts, is used to cluster multivariate data. These random starts lead to the diagnostic identification of tentative clusters. Application of the forward search to the proposed individual clusters leads to the establishment of cluster membership through the identification of non-cluster members as outlying. The method requires no prior information on the number of clusters and does not seek to classify all observations. These properties are illustrated by the analysis of 200 six-dimensional observations on Swiss banknotes. The importance of linked plots and brushing in elucidating data structures is illustrated. We also provide an automatic method for determining cluster centres and compare the behaviour of our method with model-based clustering. In a simulated example with 8 clusters, our method provides more stable and accurate solutions than model-based clustering. We consider the computational requirements of both procedures.
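To make the idea of a random start forward search concrete, the following is a minimal sketch, not the authors' implementation. It assumes that closeness to the model is measured by Mahalanobis distance from the mean and covariance of the current subset, and that the monitored quantity is the minimum distance among observations outside the subset; neither detail is stated in the abstract. The function names `forward_search` and `random_start_searches`, the initial subset size of p + 1, and the small `ridge` term are all illustrative choices.

```python
import numpy as np


def forward_search(X, start, ridge=1e-8):
    """Run one forward search from the initial subset `start`.

    Returns the minimum Mahalanobis distance among observations
    outside the subset at each step (an assumed monitoring statistic).
    """
    n, p = X.shape
    subset = np.asarray(start)
    min_dist = []
    while len(subset) < n:
        # Fit the model (mean and covariance) to the current subset only.
        mu = X[subset].mean(axis=0)
        cov = np.cov(X[subset], rowvar=False) + ridge * np.eye(p)
        # Squared Mahalanobis distance of every observation from the fit.
        diff = X - mu
        d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        # Monitor the closest observation that is not yet in the subset.
        outside = np.setdiff1d(np.arange(n), subset)
        min_dist.append(np.sqrt(d2[outside].min()))
        # Order all observations by closeness and grow the subset by one.
        subset = np.argsort(d2)[: len(subset) + 1]
    return np.array(min_dist)


def random_start_searches(X, n_starts=20, seed=0):
    """Run forward searches from randomly chosen initial subsets."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m0 = p + 1  # assumed initial subset size
    return np.array(
        [
            forward_search(X, rng.choice(n, size=m0, replace=False))
            for _ in range(n_starts)
        ]
    )


# Synthetic illustration only: two well-separated groups in two dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(8, 1, (30, 2))])
trajectories = random_start_searches(X, n_starts=5, seed=0)
print(trajectories.shape)
```

Each row of `trajectories` is one search. Plotting the rows against subset size is one way to look for the diagnostic structure the abstract describes, since searches that settle into the same group would be expected to follow similar trajectories.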

Item Type: Article
Official URL: http://www.tandfonline.com/toc/cjas20/current
Subjects: H Social Sciences > HA Statistics
Sets: Departments > Statistics
Date Deposited: 04 Apr 2017 09:57
Last Modified: 25 Sep 2017 09:21
URI: http://eprints.lse.ac.uk/id/eprint/72291
