On evaluation of outlier rankings and outlier scores

Erich Schubert, Remigius Wojdanowski, Arthur Zimek, Hans-Peter Kriegel

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

Outlier detection research is currently focused on the development of new methods and on improving the computation time for these methods. Evaluation, however, remains rather heuristic, often considering just the precision in the top-k results or the area under the ROC curve. These evaluation procedures do not allow for an assessment of the similarity between methods. Judging the similarity of, or correlation between, two rankings of outlier scores is an important question in itself, but it is also an essential step towards meaningfully building outlier detection ensembles, where this aspect has so far been completely ignored. In this study, our generalized view of evaluation methods allows us both to evaluate the performance of existing methods and to compare different methods w.r.t. their detection performance. Our new evaluation framework takes the class imbalance problem into consideration and offers new insights into the similarity and redundancy of existing outlier detection methods. As a result, the design of effective ensemble methods for outlier detection is considerably enhanced.
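For orientation, the following minimal Python sketch (the toy data and variable names are illustrative assumptions, not part of the paper) computes the two conventional measures the abstract criticizes as heuristic, precision in the top-k results and ROC AUC, plus a Spearman rank correlation between two scorings of the kind one would inspect before combining detectors into an ensemble:

    # Baseline evaluation measures for outlier scores. This is NOT the
    # paper's adjusted-for-chance framework, only the conventional
    # measures it generalizes.
    import numpy as np
    from scipy.stats import spearmanr
    from sklearn.metrics import roc_auc_score

    def precision_at_k(labels, scores, k):
        """Fraction of true outliers among the k highest-scored objects."""
        top_k = np.argsort(scores)[::-1][:k]
        return labels[top_k].sum() / k

    # Toy data (assumed for illustration): 1 = outlier, 0 = inlier.
    rng = np.random.default_rng(0)
    labels = np.array([1] * 5 + [0] * 95)            # 5% outliers: class imbalance
    scores_a = labels * 2.0 + rng.normal(0, 1, 100)  # hypothetical detector A
    scores_b = labels * 1.5 + rng.normal(0, 1, 100)  # hypothetical detector B

    print("precision@5 (A):", precision_at_k(labels, scores_a, 5))
    print("ROC AUC     (A):", roc_auc_score(labels, scores_a))

    # Rank correlation between the two scorings: highly correlated
    # detectors contribute little diversity to an ensemble.
    rho, _ = spearmanr(scores_a, scores_b)
    print("Spearman rho (A vs. B):", rho)

Note that on such imbalanced data a random ranking already yields an expected precision@k equal to the outlier rate (here 0.05); accounting for such chance baselines under class imbalance is precisely the gap the paper's framework addresses.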

Original language: English
Title of host publication: Proceedings of the 12th SIAM International Conference on Data Mining
Editors: Joydeep Ghosh, Huan Liu, Ian Davidson, Carlotta Domeniconi, Chandrika Kamath
Publication date: Dec 2012
Pages: 1047–1058
ISBN (Print): 978-1-61197-232-0
ISBN (Electronic): 978-1-61197-282-5
DOIs: 10.1137/1.9781611972825.90
Publication status: Published - Dec 2012
Externally published: Yes
Event: 12th SIAM International Conference on Data Mining, Anaheim, United States
Duration: 26 Apr 2012 – 28 Apr 2012

Conference

Conference: 12th SIAM International Conference on Data Mining
Country: United States
City: Anaheim
Period: 26/04/2012 – 28/04/2012
Sponsor: American Statistical Association

Cite this

Schubert, E., Wojdanowski, R., Zimek, A., & Kriegel, H.-P. (2012). On evaluation of outlier rankings and outlier scores. In J. Ghosh, H. Liu, I. Davidson, C. Domeniconi, & C. Kamath (Eds.), Proceedings of the 12th SIAM International Conference on Data Mining (pp. 1047–1058). SIAM. https://doi.org/10.1137/1.9781611972825.90