Unsupervised outlier detection has an advantage over supervised algorithms: it requires no labels. Because learning is not based on labels, these algorithms can detect outliers that follow previously unseen patterns. This is advantageous in, e.g., cybersecurity, where the adversary continuously changes their approach.
The first approaches to unsupervised outlier detection were relatively simple, and the users were typically statisticians who, with their insight, could match the appropriate algorithms to a specific application. With growing amounts and complexity of data, outlier algorithms have become much more complex, and the latest development is towards highly complex algorithms that are often uninterpretable to the user. This is a challenge both when a data scientist applies the outlier algorithms and when end users are presented with the detected outliers. This development has led to a new field of research: explaining previously uninterpretable unsupervised outlier algorithms.
This thesis investigates explainable outlier algorithms applied in three domains: HTTP intrusion detection, consumer behaviour event detection, and image outlier object detection. In each domain, we have developed a new algorithm and tested and evaluated it empirically, as close to the users as possible.
Finally, in a position paper, we discuss the what, the who, and the why of explainable outlier detection and introduce a new perspective on outlier detection interpretation and explanation.
Translated title of the contribution: Anvendelse af Explainable Outlier Detection: En Empirisk og Teoretisk Undersøgelse af Anvendelse af Explainable Outlier Detection
- University of Southern Denmark
- Schneider-Kamp, Peter, Principal supervisor
- Zimek, Arthur, Co-supervisor
Publication status: Published - 11 Apr 2022
A print copy of the full thesis is restricted to reference use in the Library.