Abstract
The performance of similarity measures for search, indexing, and data mining applications tends to degrade rapidly as the dimensionality of the data increases. The effects of the so-called 'curse of dimensionality' have been studied by researchers for data sets generated according to a single data distribution. In this paper, we study the effects of this phenomenon on different similarity measures for multiply-distributed data. In particular, we assess the performance of shared-neighbor similarity measures, which are secondary similarity measures based on the rankings of data objects induced by some primary distance measure. We find that rank-based similarity measures can result in more stable performance than their associated primary distance measures.
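To make the idea of a secondary, rank-based measure concrete, here is a minimal sketch of one common form of shared-neighbor similarity: the fraction of k nearest neighbors that two objects have in common. It is an illustration under stated assumptions, not the paper's implementation. The Euclidean primary distance, the exclusion of each point from its own neighbor list, the toy data, and the helper names `knn_indices` and `snn_similarity` are all choices made for this example.

```python
import math


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest neighbors.

    Assumptions for this sketch: the primary distance is Euclidean, and a
    point is not counted among its own neighbors.
    """
    neighbors = []
    for i, p in enumerate(points):
        # Rank all other points by their primary distance to p.
        ranked = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(ranked[:k]))
    return neighbors


def snn_similarity(neighbors, i, j, k):
    """Fraction of the k nearest neighbors shared by points i and j, in [0, 1]."""
    return len(neighbors[i] & neighbors[j]) / k


# Hypothetical toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
k = 3
neighbors = knn_indices(points, k)
same_group = snn_similarity(neighbors, 0, 1, k)   # two points from the same group
cross_group = snn_similarity(neighbors, 0, 4, k)  # one point from each group
```

The secondary measure depends only on the neighbor rankings, not on the magnitudes of the primary distances. In this toy data, the same-group pair shares two of its three neighbors, while the cross-group pair shares none.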
Original language | English |
---|---|
Title of host publication | Scientific and Statistical Database Management - 22nd International Conference, SSDBM 2010, Proceedings |
Editors | M. Gertz, B. Ludäscher |
Publisher | Springer |
Publication date | 3. Aug 2010 |
Pages | 482-500 |
ISBN (Print) | 978-3-642-13817-1 |
ISBN (Electronic) | 978-3-642-13818-8 |
Publication status | Published - 3. Aug 2010 |
Externally published | Yes |
Event | 22nd International Conference on Scientific and Statistical Database Management - Heidelberg, Germany; Duration: 30. Jun 2010 → 2. Jul 2010; Conference number: 22 |
Conference
Conference | 22nd International Conference on Scientific and Statistical Database Management |
---|---|
Number | 22 |
Country/Territory | Germany |
City | Heidelberg |
Period | 30/06/2010 → 02/07/2010 |
Sponsor | Heidelberg University, Heidelberg Institute for Theoretical Studies (HITS) |
Series | Lecture Notes in Computer Science |
---|---|
Volume | 6187 |
ISSN | 0302-9743 |
Fingerprint
Research topics of 'Can shared-neighbor distances defeat the curse of dimensionality?'
Related datasets
- ELKI Multi-View Clustering Data Sets Based on the Amsterdam Library of Object Images (ALOI)
Schubert, E. (Creator) & Zimek, A. (Creator), Zenodo, 30. Jun 2010
Dataset