Automatic aspect discrimination in data clustering

Danilo Horta*, Ricardo J.G.B. Campello

*Corresponding author

Publication: Contribution to journal > Journal article > Research > Peer-reviewed

Abstract

The attributes describing a data set may often be arranged in meaningful subsets, each of which corresponds to a different aspect of the data. An unsupervised algorithm (SCAD) that simultaneously performs fuzzy clustering and aspects weighting was proposed in the literature. However, SCAD may fail and halt given certain conditions. To fix this problem, its steps are modified and then reordered to reduce the number of parameters required to be set by the user. In this paper we prove that each step of the resulting algorithm, named ASCAD, globally minimizes its cost-function with respect to the argument being optimized. The asymptotic analysis of ASCAD leads to a time complexity which is the same as that of fuzzy c-means. A hard version of the algorithm and a novel validity criterion that considers aspect weights in order to estimate the number of clusters are also described. The proposed method is assessed over several artificial and real data sets.
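For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of standard fuzzy c-means in pure Python. It is not SCAD or ASCAD: the abstract does not give their update rules, so no aspect weights appear here. The function name `fuzzy_c_means`, its parameters (`m`, `iters`, `tol`, `seed`), and the random initialisation are illustrative assumptions. Each iteration visits every point, cluster, and attribute a constant number of times, which is the kind of per-iteration cost the abstract's complexity comparison refers to.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (illustrative only, not SCAD/ASCAD).

    points: list of equal-length numeric tuples; c: number of clusters;
    m > 1: fuzzifier. Returns (centers, memberships).
    """
    rng = random.Random(seed)
    n, d = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ik^m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][j] for i in range(n)) / total
                 for j in range(d)]
            )

        # Membership update from squared Euclidean distances.
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][j] - centers[k][j]) ** 2 for j in range(d))
                  for k in range(c)]
            zeros = [k for k in range(c) if d2[k] == 0.0]
            if zeros:
                # The point coincides with a center: split membership there.
                new = [1.0 / len(zeros) if k in zeros else 0.0
                       for k in range(c)]
            else:
                inv = [dk ** (-1.0 / (m - 1.0)) for dk in d2]
                s = sum(inv)
                new = [v / s for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return centers, u
```

A call such as `fuzzy_c_means([(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)], c=2)` returns two centers and, for each point, a membership row that sums to 1. An aspect-weighted method in the spirit of the abstract would additionally learn one weight per attribute subset, which this sketch deliberately leaves out.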

Original language: English
Journal: Pattern Recognition
Volume: 45
Issue number: 12
Pages (from-to): 4370-4388
ISSN: 0031-3203
Status: Published - Dec. 2012
Externally published: Yes

Bibliographical note

Funding Information:
The authors thank CNPq and FAPESP for their financial support.
