Good and Bad Neighborhood Approximations for Outlier Detection Ensembles

Evelyn Kirner, Erich Schubert, Arthur Zimek

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

Outlier detection methods have used approximate neighborhoods in filter-refinement approaches. Outlier detection ensembles have used artificially obfuscated neighborhoods to achieve diverse ensemble members. Here we argue that outlier detection models could be based on approximate neighborhoods in the first place, thus gaining in both efficiency and effectiveness. It depends, however, on the type of approximation, as only some seem beneficial for the task of outlier detection, while no (large) benefit can be seen for others. In particular, we argue that space-filling curves are beneficial approximations, as they have a stronger tendency to underestimate the density in sparse regions than in dense regions. In comparison, LSH and NN-Descent do not have such a tendency and do not seem to be beneficial for the construction of outlier detection ensembles.
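The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Z-order (Morton) curve as the space-filling curve, a k-nearest-neighbor distance as the outlier score, and random grid shifts to diversify ensemble members. All function names and parameters (`z_order_key`, `approx_knn_distances`, `ensemble_scores`, `window`, `members`) are hypothetical. Because candidates are taken only from a window along the curve, the approximate k-NN distance can never fall below the exact one, so local density is only ever underestimated.

```python
import math
import random


def z_order_key(point, lo, width, bits=10):
    """Morton code: quantize each coordinate to `bits` bits, then interleave the bits."""
    cells = (1 << bits) - 1
    coords = [min(cells, max(0, int((x - a) / w * cells)))
              for x, a, w in zip(point, lo, width)]
    key = 0
    for bit in range(bits - 1, -1, -1):
        for c in coords:
            key = (key << 1) | ((c >> bit) & 1)
    return key


def approx_knn_distances(data, k, window, shift):
    """k-NN distance per point, using only neighbors along a shifted Z-order curve."""
    if not k <= window < len(data):
        raise ValueError("need k <= window < number of points")
    dims = range(len(data[0]))
    lo = [min(p[d] for p in data) for d in dims]
    # The grid spans twice the data extent so that shifted points stay inside it.
    width = [2.0 * (max(p[d] for p in data) - lo[d]) or 1.0 for d in dims]
    keys = [z_order_key([x + s * w / 2.0 for x, s, w in zip(p, shift, width)],
                        lo, width)
            for p in data]
    order = sorted(range(len(data)), key=keys.__getitem__)
    result = [0.0] * len(data)
    for pos, i in enumerate(order):
        # Candidates: up to `window` predecessors and successors on the curve.
        neighbors = order[max(0, pos - window):pos] + order[pos + 1:pos + 1 + window]
        dists = sorted(math.dist(data[i], data[j]) for j in neighbors)
        result[i] = dists[k - 1]
    return result


def ensemble_scores(data, k=5, window=10, members=8, seed=0):
    """Average the approximate k-NN distances over several randomly shifted curves."""
    rng = random.Random(seed)
    total = [0.0] * len(data)
    for _ in range(members):
        shift = [rng.random() for _ in data[0]]  # fraction of the extent, per dimension
        for i, d in enumerate(approx_knn_distances(data, k, window, shift)):
            total[i] += d
    return [t / members for t in total]
```

Each ensemble member costs one sort plus a constant number of distance computations per point, which is where the efficiency gain over an exact neighbor search would come from. Whether the averaged score also ranks outliers better depends on the data and on the approximation, as the abstract cautions.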
Original language: English
Title of host publication: Proceedings of the 10th International Conference on Similarity Search and Applications
Editors: Christian Beecks, Felix Borutta, Peer Kröger, Thomas Seidl
Place of Publication: Cham
Publisher: Springer
Publication date: 2017
Pages: 173-187
ISBN (Print): 978-3-319-68473-4
ISBN (Electronic): 978-3-319-68474-1
DOIs: https://doi.org/10.1007/978-3-319-68474-1_12
Publication status: Published - 2017
Event: 10th International Conference on Similarity Search and Applications - Munich, Germany
Duration: 4 Oct 2017 to 6 Oct 2017
Conference number: 10
http://www.sisap.org/2017/index.html

Conference

Conference: 10th International Conference on Similarity Search and Applications
Number: 10
Country: Germany
City: Munich
Period: 04/10/2017 to 06/10/2017
Internet address: http://www.sisap.org/2017/index.html
Series: Lecture Notes in Computer Science
Volume: 10609
ISSN: 0302-9743

Fingerprint

outlier
detection method
detection
filter

Cite this

Kirner, E., Schubert, E., & Zimek, A. (2017). Good and Bad Neighborhood Approximations for Outlier Detection Ensembles. In C. Beecks, F. Borutta, P. Kröger, & T. Seidl (Eds.), Proceedings of the 10th International Conference on Similarity Search and Applications (pp. 173-187). Cham: Springer. Lecture Notes in Computer Science, Vol. 10609. https://doi.org/10.1007/978-3-319-68474-1_12
Kirner, Evelyn ; Schubert, Erich ; Zimek, Arthur. / Good and Bad Neighborhood Approximations for Outlier Detection Ensembles. Proceedings of the 10th International Conference on Similarity Search and Applications. editor / Christian Beecks ; Felix Borutta ; Peer Kröger ; Thomas Seidl. Cham : Springer, 2017. pp. 173-187 (Lecture Notes in Computer Science, Vol. 10609).
@inproceedings{a43988658ad945e6bacd8bf81c91b534,
title = "Good and Bad Neighborhood Approximations for Outlier Detection Ensembles",
abstract = "Outlier detection methods have used approximate neighborhoods in filter-refinement approaches. Outlier detection ensembles have used artificially obfuscated neighborhoods to achieve diverse ensemble members. Here we argue that outlier detection models could be based on approximate neighborhoods in the first place, thus gaining in both efficiency and effectiveness. It depends, however, on the type of approximation, as only some seem beneficial for the task of outlier detection, while no (large) benefit can be seen for others. In particular, we argue that space-filling curves are beneficial approximations, as they have a stronger tendency to underestimate the density in sparse regions than in dense regions. In comparison, LSH and NN-Descent do not have such a tendency and do not seem to be beneficial for the construction of outlier detection ensembles.",
author = "Evelyn Kirner and Erich Schubert and Arthur Zimek",
year = "2017",
doi = "10.1007/978-3-319-68474-1_12",
language = "English",
isbn = "978-3-319-68473-4",
pages = "173--187",
editor = "Christian Beecks and Felix Borutta and Peer Kr{\"o}ger and Thomas Seidl",
booktitle = "Proceedings of the 10th International Conference on Similarity Search and Applications",
publisher = "Springer",
address = "Cham",
}

Kirner, E, Schubert, E & Zimek, A 2017, Good and Bad Neighborhood Approximations for Outlier Detection Ensembles. in C Beecks, F Borutta, P Kröger & T Seidl (eds), Proceedings of the 10th International Conference on Similarity Search and Applications. Springer, Cham, Lecture Notes in Computer Science, vol. 10609, pp. 173-187, 10th International Conference on Similarity Search and Applications, Munich, Germany, 04/10/2017. https://doi.org/10.1007/978-3-319-68474-1_12

TY - GEN
T1 - Good and Bad Neighborhood Approximations for Outlier Detection Ensembles
AU - Kirner, Evelyn
AU - Schubert, Erich
AU - Zimek, Arthur
PY - 2017
Y1 - 2017
AB - Outlier detection methods have used approximate neighborhoods in filter-refinement approaches. Outlier detection ensembles have used artificially obfuscated neighborhoods to achieve diverse ensemble members. Here we argue that outlier detection models could be based on approximate neighborhoods in the first place, thus gaining in both efficiency and effectiveness. It depends, however, on the type of approximation, as only some seem beneficial for the task of outlier detection, while no (large) benefit can be seen for others. In particular, we argue that space-filling curves are beneficial approximations, as they have a stronger tendency to underestimate the density in sparse regions than in dense regions. In comparison, LSH and NN-Descent do not have such a tendency and do not seem to be beneficial for the construction of outlier detection ensembles.
U2 - 10.1007/978-3-319-68474-1_12
DO - 10.1007/978-3-319-68474-1_12
M3 - Article in proceedings
SN - 978-3-319-68473-4
SP - 173
EP - 187
BT - Proceedings of the 10th International Conference on Similarity Search and Applications
A2 - Beecks, Christian
A2 - Borutta, Felix
A2 - Kröger, Peer
A2 - Seidl, Thomas
PB - Springer
CY - Cham
ER -

Kirner E, Schubert E, Zimek A. Good and Bad Neighborhood Approximations for Outlier Detection Ensembles. In Beecks C, Borutta F, Kröger P, Seidl T, editors, Proceedings of the 10th International Conference on Similarity Search and Applications. Cham: Springer. 2017. p. 173-187. (Lecture Notes in Computer Science, Vol. 10609). https://doi.org/10.1007/978-3-319-68474-1_12