Efficiency issues of evolutionary k-means

M. C. Naldi, R. J.G.B. Campello, E. R. Hruschka, A. C.P.L.F. Carvalho

Publication: Contribution to journal › Journal article › Research › peer review

Abstract

One of the top ten most influential data mining algorithms, k-means, is known for being simple and scalable. However, it is sensitive to the initialization of prototypes and requires that the number of clusters be specified in advance. This paper shows that evolutionary techniques conceived to guide the application of k-means can be more computationally efficient than systematic (i.e., repetitive) approaches that try to get around the above-mentioned drawbacks by repeatedly running the algorithm from different configurations for the number of clusters and initial positions of prototypes. To do so, a modified version of a (k-means based) fast evolutionary algorithm for clustering is employed. Theoretical complexity analyses for the systematic and evolutionary algorithms of interest are provided. Computational experiments and statistical analyses of the results are presented for artificial and text mining data sets.
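
The systematic (repetitive) baseline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and says nothing about the evolutionary algorithm itself. The function names (`kmeans`, `simplified_silhouette`, `systematic_kmeans`), the parameters, and the use of a centroid-based simplified silhouette to compare partitions with different numbers of clusters are all assumptions made for illustration; the abstract does not state which quality criterion is used.

```python
import math
import random


def kmeans(points, k, rng, max_iter=50):
    """Plain Lloyd-style k-means; returns (centroids, labels)."""
    # Initial prototypes: k distinct data points chosen at random.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no assignment changed
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


def simplified_silhouette(points, centroids, labels):
    """Centroid-based silhouette in [-1, 1]; higher is better.

    Assumed criterion for this sketch; requires at least two centroids.
    """
    total = 0.0
    for p, lab in zip(points, labels):
        a = math.dist(p, centroids[lab])  # distance to own centroid
        b = min(math.dist(p, c) for j, c in enumerate(centroids) if j != lab)
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / len(points)


def systematic_kmeans(points, k_min, k_max, n_restarts, seed=0):
    """Run k-means for every k in [k_min, k_max], n_restarts times each.

    Returns the best (score, k, centroids, labels) found. k_min must be >= 2.
    """
    rng = random.Random(seed)
    best = None
    for k in range(k_min, k_max + 1):
        for _ in range(n_restarts):
            centroids, labels = kmeans(points, k, rng)
            score = simplified_silhouette(points, centroids, labels)
            if best is None or score > best[0]:
                best = (score, k, centroids, labels)
    return best


if __name__ == "__main__":
    # Two well-separated groups of four points each (toy data).
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    score, k, _, labels = systematic_kmeans(data, k_min=2, k_max=4, n_restarts=5)
    print(k, round(score, 3), labels)
```

The nested loops make the cost of this baseline explicit: the number of full k-means runs is (k_max - k_min + 1) × n_restarts, which is the kind of repetition the paper compares against an evolutionary search.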

Original language: English
Journal: Applied Soft Computing Journal
Volume: 11
Issue number: 2
Pages (from-to): 1938-1952
ISSN: 1568-4946
Status: Published - Mar 2011
Published externally: Yes

Bibliographical note

Funding Information:
The authors acknowledge the Brazilian Research Agencies CAPES, CNPq, and FAPESP for their financial support to this work. They also acknowledge Vinícius Santino Alves for making available the original F-EAC’s code and for his kind support.
