Similarity-Based Unsupervised Evaluation of Outlier Detection

Henrique O. Marques*, Arthur Zimek, Ricardo J.G.B. Campello, Jörg Sander

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

The evaluation of unsupervised algorithm results is one of the most challenging tasks in data mining research. Where labeled data are not available, one has to rely in practice on so-called internal evaluation, which is based solely on the data and the assessed solutions themselves. In unsupervised cluster analysis, indices for internal evaluation of clustering solutions have been studied for decades, with a multitude of indices available, based on different criteria. In unsupervised outlier detection, however, this problem has only recently received some attention, and very few indices are available so far. In this paper, we provide a new internal index based on criteria different from the ones available in the literature. The index uses a (generic) similarity measure to efficiently evaluate candidate outlier detection solutions in a completely unsupervised way. We evaluate and compare this index against existing indices in terms of quality and run-time performance using collections of both real and synthetic datasets.

Original language: English
Title of host publication: Similarity Search and Applications - 15th International Conference, SISAP 2022, Proceedings
Editors: Tomáš Skopal, Jakub Lokoč, Fabrizio Falchi, Maria Luisa Sapino, Ilaria Bartolini, Marco Patella
Publisher: Springer Science+Business Media
Publication date: 2022
Pages: 234-248
ISBN (Print): 9783031178481
Publication status: Published - 2022
Event: 15th International Conference on Similarity Search and Applications, SISAP 2022 - Bologna, Italy
Duration: 5 Oct 2022 to 7 Oct 2022

Conference

Conference: 15th International Conference on Similarity Search and Applications, SISAP 2022
Country/Territory: Italy
City: Bologna
Period: 05/10/2022 to 07/10/2022
Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13590 LNCS
ISSN: 0302-9743

Keywords

  • Model selection
  • Outlier detection
  • Unsupervised evaluation
  • Validation
