Active learning strategies for semi-supervised DBSCAN

Jundong Li, Jörg Sander, Ricardo Campello, Arthur Zimek

Research output: Chapter in Book/Report/Conference proceeding; Article in proceedings; Research; peer-review

Abstract

The semi-supervised, density-based clustering algorithm SSDBSCAN uses a small set of labeled objects to extract clusters at different density levels from a given dataset. A critical assumption of SSDBSCAN, however, is that at least one labeled object is provided for each natural cluster in the dataset. This assumption may be unrealistic when only very few labeled objects can be provided, for instance because of the cost of determining the class label of an object. In this paper, we introduce a novel active learning strategy that selects the "most representative" objects, whose class labels should then be determined and used as input for SSDBSCAN. By incorporating a Laplacian Graph Regularizer into a Local Linear Reconstruction method, our proposed algorithm selects objects that represent the whole data space well. Experiments on synthetic and real datasets show that, with the proposed active learning strategy, SSDBSCAN extracts more meaningful clusters even when only very few labeled objects are provided.
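
The abstract does not state the objective function, so the following is only a minimal sketch of one plausible way to combine a local linear reconstruction term with a graph Laplacian regularizer and greedily pick representative objects to label. It is not the authors' implementation. All function names (`knn_indices`, `lle_weights`, `knn_laplacian`, `select_representatives`), the parameters `k`, `mu`, `lam`, `eps`, and the greedy search itself are assumptions made for illustration.

```python
import numpy as np

def knn_indices(X, k):
    # Indices of the k nearest neighbors of every point (self excluded).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def lle_weights(X, nbrs, reg=1e-3):
    # Weights that reconstruct each point from its neighbors (rows sum to 1).
    n, k = nbrs.shape
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                      # neighbors centered on x_i
        G = Z @ Z.T                                # local Gram matrix (k x k)
        G += reg * (np.trace(G) + 1e-12) * np.eye(k)  # regularize singular G
        w = np.linalg.solve(G, np.ones(k))
        W[i, nbrs[i]] = w / w.sum()
    return W

def knn_laplacian(nbrs, n):
    # Unnormalized Laplacian L = D - A of the symmetrized kNN graph.
    A = np.zeros((n, n))
    rows = np.repeat(np.arange(n), nbrs.shape[1])
    A[rows, nbrs.ravel()] = 1.0
    A = np.maximum(A, A.T)
    return np.diag(A.sum(axis=1)) - A

def select_representatives(X, budget, k=5, mu=1.0, lam=0.1, eps=1e-6):
    # Greedy selection (assumed formulation): for a candidate set with 0/1
    # diagonal indicator matrix Lam, reconstruct all points as
    #   Q = (mu*M + lam*L + eps*I + Lam)^{-1} Lam X,  M = (I - W)^T (I - W),
    # and add the candidate that minimizes ||X - Q||_F^2.
    n = X.shape[0]
    nbrs = knn_indices(X, k)
    W = lle_weights(X, nbrs)
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    R = mu * M + lam * knn_laplacian(nbrs, n) + eps * I  # eps keeps it invertible
    chosen = []
    for _ in range(budget):
        best, best_err = None, np.inf
        for c in range(n):
            if c in chosen:
                continue
            sel = np.zeros(n)
            sel[chosen + [c]] = 1.0
            Lam = np.diag(sel)
            Q = np.linalg.solve(R + Lam, Lam @ X)
            err = np.linalg.norm(X - Q) ** 2
            if err < best_err:
                best, best_err = c, err
        chosen.append(best)
    return chosen
```

Under these assumptions, calling `select_representatives(X, budget=b)` on an `(n, d)` array returns `b` row indices whose labels could then be queried and passed to a semi-supervised clusterer. Because a region with no selected object is reconstructed poorly, the greedy step tends to spread its picks across separate dense regions, which is the property the abstract asks for.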

Original language: English
Title of host publication: Advances in Artificial Intelligence: Proceedings of the 27th Canadian Conference on Artificial Intelligence
Editors: M. Sokolova, P. van Beek
Publisher: Springer VS
Publication date: 2014
Pages: 179-190
ISBN (Print): 978-3-319-06482-6
ISBN (Electronic): 978-3-319-06483-3
Publication status: Published - 2014
Externally published: Yes
Event: 27th Canadian Conference on Artificial Intelligence - Montreal, Canada
Duration: 6 May 2014 to 9 May 2014

Conference

Conference: 27th Canadian Conference on Artificial Intelligence
Country/Territory: Canada
City: Montreal
Period: 06/05/2014 to 09/05/2014
Sponsor: 'Nana Traiteur par l'Assommoir', Canadian Artificial Intelligence Association (CAIAC), GRAND (Graphics, Animation and New Media) Research Network, Grevin, Polytechnique Montreal, et al.
Series: Lecture Notes in Computer Science
Volume: 8436
ISSN: 0302-9743

Keywords

  • Active learning
  • Density-based clustering
  • Semi-supervised clustering
