On the Evaluation of Outlier Detection: Measures, Datasets, and an Empirical Study Continued

G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent, M. E. Houle

Research output: Chapter in Book/Report/Conference proceeding › Book chapter › Research › peer-review

Abstract

The evaluation of unsupervised outlier detection algorithms is a constant challenge in data mining research. Little is known regarding the strengths and weaknesses of different standard outlier detection models, and the impact of parameter choices for these algorithms. The scarcity of appropriate benchmark datasets with ground-truth annotation is a significant impediment to the evaluation of outlier methods. Even when labeled datasets are available, their suitability for the outlier detection task is typically unknown. Furthermore, the biases of commonly used evaluation measures are not fully understood. It is thus difficult to ascertain the extent to which newly proposed outlier detection methods improve over established methods. We performed an extensive experimental study (Campos et al., 2016) on the performance of a representative set of standard k-nearest-neighbor-based methods for unsupervised outlier detection, across a wide variety of datasets prepared for this purpose. Based on the overall performance of the outlier detection methods, we provide a characterization of the datasets themselves, and discuss their suitability as outlier detection benchmark sets. We also examine the most commonly used measures for comparing the performance of different methods, and suggest adaptations that are more suitable for the evaluation of outlier detection results. We present the results from our previous publication (Campos et al., 2016) as well as additional observations and measures. All results are available online in the repository at: http://www.dbs.ifi.lmu.de/research/outlier-evaluation/
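The two ingredients the abstract refers to can be made concrete. Below is a minimal Python sketch, assuming scikit-learn and NumPy, of (a) the classic kNN outlier score (distance to a point's k-th nearest neighbor) and (b) an average-precision measure adjusted for chance, so that a random ranking scores near 0 and a perfect one scores 1, in the spirit of the adaptations the study discusses. Function names such as knn_outlier_score and adjusted_average_precision are illustrative, not taken from the paper or its repository.

import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.metrics import roc_auc_score, average_precision_score


def knn_outlier_score(X, k=5):
    """Score each point by the distance to its k-th nearest neighbor
    (the classic kNN outlier model); larger score = more outlying."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)  # +1: each point is its own 0-th neighbor
    dist, _ = nn.kneighbors(X)
    return dist[:, -1]


def adjusted_average_precision(y_true, scores):
    """Average precision adjusted for chance: the expected AP of a
    random ranking equals the outlier fraction |O| / N, so rescale
    to map random to 0 and perfect to 1."""
    ap = average_precision_score(y_true, scores)
    expected = np.mean(y_true)
    return (ap - expected) / (1.0 - expected)


# Toy usage: Gaussian inliers with a few planted outliers.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(95, 2)), rng.uniform(4, 6, size=(5, 2))])
y = np.r_[np.zeros(95), np.ones(5)]  # ground-truth outlier labels

scores = knn_outlier_score(X, k=5)
print("ROC AUC:", roc_auc_score(y, scores))
print("Adjusted AP:", adjusted_average_precision(y, scores))

Note that for kNN-based detectors the neighborhood size k is itself one of the parameter choices under study; a benchmark would sweep k over a range rather than fix a single value as this sketch does.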
Original language: English
Title of host publication: Proceedings of the LWDA 2016 Workshops: KDML, FGWM, FGIR, and FGDB, Potsdam, Germany
Number of pages: 1
Publication date: 2016
Pages: 234
Publication status: Published - 2016

Cite this

Campos, G. O., Zimek, A., Sander, J., Campello, R. J. G. B., Micenková, B., Schubert, E., ... Houle, M. E. (2016). On the Evaluation of Outlier Detection: Measures, Datasets, and an Empirical Study Continued. In Proceedings of the LWDA 2016 Workshops: KDML, FGWM, FGIR, and FGDB, Potsdam, Germany (p. 234).
@inbook{9665bb608390464e8dfba767078f7ac4,
title = "On the Evaluation of Outlier Detection: Measures, Datasets, and an Empirical Study Continued",
abstract = "The evaluation of unsupervised outlier detection algorithms is a constant challenge in data mining research. Little is known regarding the strengths and weaknesses of different standard outlier detection models, and the impact of parameter choices for these algorithms. The scarcity of appropriate benchmark datasets with ground truth annotation is a significant impediment to the evaluation of outlier methods. Even when labeled datasets are available, their suitability for the outlier detection task is typically unknown. Furthermore, the biases of commonly-used evaluation measures are not fully understood. It is thus difficult to ascertain the extent to which newly-proposed outlier detection methods improve over established methods. We performed an extensive experimental study (Campos et al., 2016) on the performance of a representative set of standard $k$ nearest neighborhood-based methods for unsupervised outlier detection, across a wide variety of datasets prepared for this purpose. Based on the overall performance of the outlier detection methods, we provide a characterization of the datasets themselves, and discuss their suitability as outlier detection benchmark sets. We also examine the most commonly-used measures for comparing the performance of different methods, and suggest adaptations that are more suitable for the evaluation of outlier detection results. We present the results from our previous publication (Campos et al., 2016) as well as additional observations and measures. All results are available online in the repository at: http://www.dbs.ifi.lmu.de/research/outlier-evaluation/",
author = "Campos, {G. O.} and A. Zimek and J. Sander and Campello, {R. J. G. B.} and B. Micenkov{\'a} and E. Schubert and I. Assent and Houle, {M. E.}",
year = "2016",
language = "English",
pages = "234",
booktitle = "Proceedings of the LWDA 2016 Workshops: KDML, FGWM, FGIR, and FGDB, Potsdam, Germany",

}
