Can shared-neighbor distances defeat the curse of dimensionality?

Michael E. Houle, Hans-Peter Kriegel, Peer Kröger, Erich Schubert, Arthur Zimek

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

The performance of similarity measures for search, indexing, and data mining applications tends to degrade rapidly as the dimensionality of the data increases. The effects of the so-called 'curse of dimensionality' have been studied by researchers for data sets generated according to a single data distribution. In this paper, we study the effects of this phenomenon on different similarity measures for multiply-distributed data. In particular, we assess the performance of shared-neighbor similarity measures, which are secondary similarity measures based on the rankings of data objects induced by some primary distance measure. We find that rank-based similarity measures can result in more stable performance than their associated primary distance measures.
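As an illustration of the idea, a shared-neighbor similarity can be sketched as the overlap between the k-nearest-neighbor sets of two objects, where the neighbor sets come from a primary distance. The sketch below is not taken from the paper: Euclidean distance as the primary measure, the exclusion of each object from its own neighbor set, the normalization by k, and the names `knn_indices`, `snn_similarity`, and `points` are all assumptions made for the example.

```python
import math

def knn_indices(points, i, k):
    """Return the indices of the k nearest neighbors of points[i].

    Assumption: the primary distance is Euclidean, and a point is not
    counted as its own neighbor. Ties are broken by index.
    """
    ranked = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return {j for _, j in ranked[:k]}

def snn_similarity(points, i, j, k):
    """Secondary similarity: fraction of the k nearest neighbors shared by i and j.

    Only the ranking induced by the primary distance matters here,
    not the distance values themselves.
    """
    shared = knn_indices(points, i, k) & knn_indices(points, j, k)
    return len(shared) / k

# Hypothetical toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]

same_group = snn_similarity(points, 0, 1, k=3)   # two of three neighbors shared
cross_group = snn_similarity(points, 0, 4, k=3)  # no neighbors shared
```

Because the score depends only on neighbor rankings, rescaling all primary distances by a constant leaves it unchanged, which gives some intuition for why rank-based measures might behave more stably than the distances they are built on.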

Original language: English
Title of host publication: Scientific and Statistical Database Management - 22nd International Conference, SSDBM 2010, Proceedings
Editors: M. Gertz, B. Ludäscher
Publisher: Springer
Publication date: 3. Aug 2010
Pages: 482-500
ISBN (Print): 978-3-642-13817-1
ISBN (Electronic): 978-3-642-13818-8
Publication status: Published - 3. Aug 2010
Externally published: Yes
Event: 22nd International Conference on Scientific and Statistical Database Management - Heidelberg, Germany
Duration: 30. Jun 2010 to 2. Jul 2010
Conference number: 22

Conference

Conference: 22nd International Conference on Scientific and Statistical Database Management
Number: 22
Country/Territory: Germany
City: Heidelberg
Period: 30/06/2010 to 02/07/2010
Sponsor: Heidelberg University, Heidelberg Institute for Theoretical Studies (HITS)
Series: Lecture Notes in Computer Science
Volume: 6187
ISSN: 0302-9743
