Implementing bootstrap in Ward's algorithm to estimate the number of clusters

Sueli A. Mingoti, Francisco N. Felix

Abstract


In this paper we show how the bootstrap can be implemented in hierarchical clustering algorithms as a strategy for estimating the number of clusters (k). Ward's algorithm was chosen as an example. The estimation of k is based on a similarity coefficient and three statistical stopping rules: pseudo F, pseudo T², and the cubic clustering criterion (CCC). The performance of the estimation procedure was evaluated through Monte Carlo simulation on data with correlated and uncorrelated variables and with nonoverlapping and overlapping clusters. The procedure discussed in this paper can be used with clustering algorithms other than Ward's, and it can also provide initial solutions for non-hierarchical grouping methods.
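The abstract does not spell out the resampling scheme, so the following Python sketch shows only one plausible way to combine the bootstrap with Ward's algorithm and a single stopping rule (pseudo F). Everything in it is an assumption made for illustration rather than the authors' procedure: the helper names `pseudo_f` and `bootstrap_k`, the row-wise resampling with replacement, the candidate range for k, and the majority vote across replicates. The similarity coefficient, pseudo T², and CCC are not included.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def pseudo_f(X, labels):
    """Pseudo F (Calinski-Harabasz): (B / (k - 1)) / (W / (n - k))."""
    n = X.shape[0]
    groups = np.unique(labels)
    k = groups.size
    grand_mean = X.mean(axis=0)
    within = 0.0   # W: within-cluster sum of squares
    between = 0.0  # B: between-cluster sum of squares
    for g in groups:
        members = X[labels == g]
        center = members.mean(axis=0)
        within += ((members - center) ** 2).sum()
        between += members.shape[0] * ((center - grand_mean) ** 2).sum()
    return (between / (k - 1)) / (within / (n - k))


def bootstrap_k(X, k_range=range(2, 7), n_boot=50, seed=0):
    """Hypothetical helper: vote on k over bootstrap replicates.

    Assumption: each replicate resamples the rows of X with replacement,
    runs Ward's algorithm, and votes for the k with the largest pseudo F.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    picks = []
    for _ in range(n_boot):
        sample = X[rng.integers(0, n, size=n)]
        tree = linkage(sample, method="ward")
        scores = {
            k: pseudo_f(sample, fcluster(tree, t=k, criterion="maxclust"))
            for k in k_range
        }
        picks.append(max(scores, key=scores.get))
    values, counts = np.unique(picks, return_counts=True)
    votes = dict(zip(values.tolist(), counts.tolist()))
    return int(values[np.argmax(counts)]), votes


# Toy data (an assumption for the demo): three well-separated clusters in 2D.
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.5 * rng.standard_normal((30, 2)) for c in centers])

k_hat, votes = bootstrap_k(X, n_boot=20)
print("estimated k:", k_hat, "votes:", votes)
```

The spread of the votes across replicates gives a rough indication of how stable the estimate of k is; a flat vote distribution would suggest overlapping clusters or an uninformative stopping rule for the data at hand.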

Keywords


Ward's algorithm; Estimation of number of clusters; Bootstrap


DOI: https://doi.org/10.7177/sg.2009.V4N2A1

This work is licensed under a Creative Commons Attribution 4.0 International License.

ISSN: 1980-5160

Rua Passo da Pátria 156, bloco E, sala Sistemas & Gestão, Escola de Engenharia, São Domingos, Niterói, RJ, CEP: 24210-240

Tel.: (21) 2629-5616

Correspondence: Caixa Postal LATEC: 100175, CEP 24.020-971, Niterói, RJ