Robust Statistical Scaling of Outlier Scores: Improving the Quality of Outlier Probabilities for Outliers

Philipp Röchner*, Henrique O. Marques, Ricardo J. G. B. Campello, Arthur Zimek, Franz Rothlauf

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Book chapter; Research; peer-review

Abstract

Outlier detection algorithms typically assign an outlier score to each observation in a dataset, indicating the degree to which an observation is an outlier. However, these scores are often not comparable across algorithms and can be difficult for humans to interpret. Statistical scaling addresses this problem by transforming outlier scores into outlier probabilities without using ground-truth labels, thereby improving interpretability and comparability across algorithms. However, the quality of this transformation can be different for outliers and inliers. Missing outliers in scenarios where they are of particular interest—such as healthcare, finance, or engineering—can be costly or dangerous. Thus, ensuring good probabilities for outliers is essential. This paper argues that statistical scaling, as commonly used in the literature, does not produce equally good probabilities for outliers as for inliers. Therefore, we propose robust statistical scaling, which uses robust estimators to improve the probabilities for outliers. We evaluate several variants of our method against other outlier score transformations for real-world datasets and outlier detection algorithms, where it can improve the probabilities for outliers.
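
The abstract does not say which estimators the proposed method uses, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the Gaussian form of statistical scaling, in which scores are standardized and passed through the error function. As a hypothetical robust variant, it swaps the mean and standard deviation for the median and the scaled median absolute deviation (MAD). The function name `gaussian_scaling` and the `robust` flag are illustrative.

```python
import math
import statistics

# Constant that makes the MAD a consistent estimate of the standard
# deviation for normally distributed data.
MAD_TO_SIGMA = 1.4826


def gaussian_scaling(scores, robust=False):
    """Map raw outlier scores to values in [0, 1] without using labels.

    Assumption: higher scores mean "more outlying". With robust=False the
    location and scale are the mean and standard deviation; with robust=True
    they are the median and the scaled MAD (an illustrative choice, not
    necessarily the estimators used in the paper).
    """
    scores = list(scores)
    if robust:
        center = statistics.median(scores)
        abs_dev = [abs(s - center) for s in scores]
        spread = MAD_TO_SIGMA * statistics.median(abs_dev)
    else:
        center = statistics.fmean(scores)
        spread = statistics.pstdev(scores)

    if spread == 0:
        # No variation left to standardize against; report no outliers.
        return [0.0 for _ in scores]

    return [
        max(0.0, math.erf((s - center) / (spread * math.sqrt(2))))
        for s in scores
    ]


# Made-up scores: seven similar values and one clearly larger one.
example_scores = [1.0, 1.1, 0.9, 1.2, 1.0, 0.95, 1.05, 8.0]
classical = gaussian_scaling(example_scores)
robust = gaussian_scaling(example_scores, robust=True)
```

In this toy example the single large score inflates both the mean and the standard deviation, which pulls its own standardized value down. The median and MAD are largely unaffected by it, so the robust variant assigns that score a probability at least as high. Whether this carries over to real data depends on the detector and the dataset.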

Original language: English
Title of host publication: Similarity Search and Applications
Editors: Edgar Chávez, Benjamin Kimia, Jakub Lokoč, Marco Patella, Jan Sedmidubsky
Publisher: Springer
Publication date: 25 Oct 2024
Pages: 215-222
ISBN (Print): 978-3-031-75822-5
ISBN (Electronic): 978-3-031-75823-2
Publication status: Published, 25 Oct 2024
Event: 17th International Conference of Similarity Search and Applications, Providence, United States
Duration: 4 Nov 2024 to 6 Nov 2024

Conference

Conference: 17th International Conference of Similarity Search and Applications
Country/Territory: United States
City: Providence
Period: 04/11/2024 to 06/11/2024
Series: Lecture Notes in Computer Science
Volume: 15268
ISSN: 0302-9743
