Processing of Spatial-Keyword Range Queries in Apache Spark

Publication: Chapter in book/report/conference proceeding; Conference contribution in proceedings; Research; peer-reviewed


Abstract

Big spatio-textual data are prevalent in modern applications, where spatial objects are associated with textual descriptions. Spatial-keyword queries have been proposed for querying such data; they are challenging mainly because they combine the spatial and textual dimensions. Furthermore, scalable processing is a key challenge, due to the immense volume of the underlying data. In this paper, we address the problem of parallel processing of spatial-keyword range queries, which retrieve all spatio-textual objects that lie within a user-specified distance from a query location and whose textual description is sufficiently similar to the query keywords. Our approach relies on a mapping scheme that maps spatio-textual objects to a 2D space, thus creating compact data partitions. In turn, we exploit these partitions to distribute the mapped data effectively to worker nodes and parallelize processing. Our implementation is in Apache Spark and is shown to outperform two baseline solutions as well as two state-of-the-art systems for processing big spatial data.
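To make the query definition in the abstract concrete, the following is a minimal single-machine sketch of a spatial-keyword range query. It is not the paper's method: it does not use the 2D mapping scheme, the partitioning, or Apache Spark, and simply scans all objects. The names `SpatioTextualObject`, `jaccard`, and `range_query` are hypothetical, and both Euclidean distance and Jaccard similarity are assumptions, since the abstract does not name the distance or similarity measure.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class SpatioTextualObject:
    """A point with an attached keyword set (hypothetical type)."""
    oid: str
    x: float
    y: float
    keywords: frozenset


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def range_query(objects, qx, qy, q_keywords, radius, min_sim):
    """Brute-force scan: keep objects within `radius` of (qx, qy)
    whose keyword similarity to the query is at least `min_sim`."""
    q = frozenset(q_keywords)
    return [
        o for o in objects
        if hypot(o.x - qx, o.y - qy) <= radius   # spatial predicate (Euclidean, assumed)
        and jaccard(o.keywords, q) >= min_sim    # textual predicate
    ]


objects = [
    SpatioTextualObject("a", 0.0, 0.0, frozenset({"coffee", "wifi"})),
    SpatioTextualObject("b", 1.0, 1.0, frozenset({"coffee", "bar"})),
    SpatioTextualObject("c", 5.0, 5.0, frozenset({"coffee", "wifi"})),
    SpatioTextualObject("d", 0.5, 0.0, frozenset({"museum"})),
]

result = range_query(objects, 0.0, 0.0, {"coffee", "wifi"}, radius=2.0, min_sim=0.3)
print([o.oid for o in result])
```

In this toy data set, object `c` fails the spatial predicate and object `d` fails the textual one, so both predicates are needed to answer the query. A parallel approach such as the one the abstract describes would aim to avoid this full scan by sending each query only to the relevant partitions.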

Original language: English
Title: BigSpatial 2023 - Proceedings of the 11th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data
Editors: Ashwin Shashidharan, Martin Werner, Krishna Karthik Gadiraju, Varun Chandola, Ranga Raju Vatsavai
Number of pages: 9
Publisher: Association for Computing Machinery
Publication date: 13 Nov 2023
Pages: 23-31
ISBN (Electronic): 9798400703454
DOI
Status: Published - 13 Nov 2023
Event: 11th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2023 - Hamburg, Germany
Duration: 13 Nov 2023 → …

Conference

Conference: 11th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2023
Country/Territory: Germany
City: Hamburg
Period: 13/11/2023 → …

Funding

This work was supported by the Horizon Europe R&I programme EMERALDS under the GA No. 101093051.
