Discriminative features for identifying and interpreting outliers

Xuan Hong Dang, Ira Assent, Raymond T. Ng, Arthur Zimek, Erich Schubert

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

We consider the problem of outlier detection and interpretation. While most existing studies focus on the first problem, we simultaneously address the equally important challenge of outlier interpretation. We propose an algorithm that uncovers outliers in subspaces of reduced dimensionality in which they are well discriminated from regular objects while at the same time retaining the natural local structure of the original data to ensure the quality of outlier explanation. Our algorithm takes a mathematically appealing approach from the spectral graph embedding theory and we show that it achieves the globally optimal solution for the objective of subspace learning. By using a number of real-world datasets, we demonstrate its appealing performance not only w.r.t. the outlier detection rate but also w.r.t. the discriminative human-interpretable features. This is the first approach to exploit discriminative features for both outlier detection and interpretation, leading to better understanding of how and why the hidden outliers are exceptional.

Original language: English
Title of host publication: Proceedings of the 2014 IEEE 30th International Conference on Data Engineering
Publisher: IEEE
Publication date: 14. May 2014
Pages: 88-99
ISBN (Print): 978-1-4799-2555-1
DOIs: https://doi.org/10.1109/ICDE.2014.6816642
Publication status: Published - 14. May 2014
Externally published: Yes
Event: 30th IEEE International Conference on Data Engineering - Chicago, United States
Duration: 31. Mar 2014 to 4. Apr 2014

Conference

Conference: 30th IEEE International Conference on Data Engineering
Country: United States
City: Chicago
Period: 31/03/2014 to 04/04/2014
Sponsor: Purdue University, Google Inc., HERE/Nokia, Northwestern University, Microsoft, Qatar Computing Research Institute
Series: Proceedings of the International Conference on Data Engineering
ISSN: 1063-6382

Fingerprint

Graph theory

Cite this

Dang, X. H., Assent, I., Ng, R. T., Zimek, A., & Schubert, E. (2014). Discriminative features for identifying and interpreting outliers. In Proceedings of the 2014 IEEE 30th International Conference on Data Engineering (pp. 88-99). IEEE. Proceedings of the International Conference on Data Engineering https://doi.org/10.1109/ICDE.2014.6816642
Dang, Xuan Hong ; Assent, Ira ; Ng, Raymond T. ; Zimek, Arthur ; Schubert, Erich. / Discriminative features for identifying and interpreting outliers. Proceedings of the 2014 IEEE 30th International Conference on Data Engineering. IEEE, 2014. pp. 88-99 (Proceedings of the International Conference on Data Engineering).
@inproceedings{c56959206c274274b85600b56a6de0ab,
title = "Discriminative features for identifying and interpreting outliers",
abstract = "We consider the problem of outlier detection and interpretation. While most existing studies focus on the first problem, we simultaneously address the equally important challenge of outlier interpretation. We propose an algorithm that uncovers outliers in subspaces of reduced dimensionality in which they are well discriminated from regular objects while at the same time retaining the natural local structure of the original data to ensure the quality of outlier explanation. Our algorithm takes a mathematically appealing approach from the spectral graph embedding theory and we show that it achieves the globally optimal solution for the objective of subspace learning. By using a number of real-world datasets, we demonstrate its appealing performance not only w.r.t. the outlier detection rate but also w.r.t. the discriminative human-interpretable features. This is the first approach to exploit discriminative features for both outlier detection and interpretation, leading to better understanding of how and why the hidden outliers are exceptional.",
author = "Dang, {Xuan Hong} and Ira Assent and Ng, {Raymond T.} and Arthur Zimek and Erich Schubert",
year = "2014",
month = "5",
day = "14",
doi = "10.1109/ICDE.2014.6816642",
language = "English",
isbn = "978-1-4799-2555-1",
series = "Proceedings of the International Conference on Data Engineering",
publisher = "IEEE",
pages = "88--99",
booktitle = "Proceedings of the 2014 IEEE 30th International Conference on Data Engineering",
address = "United States",
}

Dang, XH, Assent, I, Ng, RT, Zimek, A & Schubert, E 2014, Discriminative features for identifying and interpreting outliers. in Proceedings of the 2014 IEEE 30th International Conference on Data Engineering. IEEE, Proceedings of the International Conference on Data Engineering, pp. 88-99, 30th IEEE International Conference on Data Engineering, Chicago, United States, 31/03/2014. https://doi.org/10.1109/ICDE.2014.6816642

Discriminative features for identifying and interpreting outliers. / Dang, Xuan Hong; Assent, Ira; Ng, Raymond T.; Zimek, Arthur; Schubert, Erich.

Proceedings of the 2014 IEEE 30th International Conference on Data Engineering. IEEE, 2014. p. 88-99 (Proceedings of the International Conference on Data Engineering).


TY - GEN

T1 - Discriminative features for identifying and interpreting outliers

AU - Dang, Xuan Hong

AU - Assent, Ira

AU - Ng, Raymond T.

AU - Zimek, Arthur

AU - Schubert, Erich

PY - 2014/5/14

Y1 - 2014/5/14

N2 - We consider the problem of outlier detection and interpretation. While most existing studies focus on the first problem, we simultaneously address the equally important challenge of outlier interpretation. We propose an algorithm that uncovers outliers in subspaces of reduced dimensionality in which they are well discriminated from regular objects while at the same time retaining the natural local structure of the original data to ensure the quality of outlier explanation. Our algorithm takes a mathematically appealing approach from the spectral graph embedding theory and we show that it achieves the globally optimal solution for the objective of subspace learning. By using a number of real-world datasets, we demonstrate its appealing performance not only w.r.t. the outlier detection rate but also w.r.t. the discriminative human-interpretable features. This is the first approach to exploit discriminative features for both outlier detection and interpretation, leading to better understanding of how and why the hidden outliers are exceptional.

AB - We consider the problem of outlier detection and interpretation. While most existing studies focus on the first problem, we simultaneously address the equally important challenge of outlier interpretation. We propose an algorithm that uncovers outliers in subspaces of reduced dimensionality in which they are well discriminated from regular objects while at the same time retaining the natural local structure of the original data to ensure the quality of outlier explanation. Our algorithm takes a mathematically appealing approach from the spectral graph embedding theory and we show that it achieves the globally optimal solution for the objective of subspace learning. By using a number of real-world datasets, we demonstrate its appealing performance not only w.r.t. the outlier detection rate but also w.r.t. the discriminative human-interpretable features. This is the first approach to exploit discriminative features for both outlier detection and interpretation, leading to better understanding of how and why the hidden outliers are exceptional.

U2 - 10.1109/ICDE.2014.6816642

DO - 10.1109/ICDE.2014.6816642

M3 - Article in proceedings

AN - SCOPUS:84901756572

SN - 978-1-4799-2555-1

T3 - Proceedings of the International Conference on Data Engineering

SP - 88

EP - 99

BT - Proceedings of the 2014 IEEE 30th International Conference on Data Engineering

PB - IEEE

ER -

Dang XH, Assent I, Ng RT, Zimek A, Schubert E. Discriminative features for identifying and interpreting outliers. In Proceedings of the 2014 IEEE 30th International Conference on Data Engineering. IEEE. 2014. p. 88-99. (Proceedings of the International Conference on Data Engineering). https://doi.org/10.1109/ICDE.2014.6816642