Efficient Execution of Top-K SPARQL Queries

Presented at: The 11th International Semantic Web Conference (ISWC2012)

by Sara Magliacane, Alessandro Bozzon, Emanuele Della Valle

Webpage: http://dx.doi.org/10.1007/978-3-642-35176-1_22
Webpage: http://iswc2012.semanticweb.org/sites/default/files/76490337.pdf

Top-k queries, i.e., queries returning the top k results ordered by a user-defined scoring function, are an important category of queries. Order is an important property of data that can be exploited to speed up query processing. State-of-the-art SPARQL engines underuse order, and top-k queries are mostly managed with a materialize-then-sort processing scheme that computes all the matching solutions (e.g., thousands) even if only a limited number k (e.g., ten) are requested. The SPARQL-RANK algebra is an extended SPARQL algebra that treats order as a first-class citizen, enabling efficient split-and-interleave processing schemes that can be adopted to improve the performance of top-k SPARQL queries. In this paper we propose an incremental execution model for SPARQL-RANK queries, compare the performance of alternative physical operators, and propose a rank-aware join algorithm optimized for native RDF stores. Experiments conducted with an open-source implementation of a SPARQL-RANK query engine based on ARQ show that the evaluation of top-k queries can be sped up by orders of magnitude.
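
To make the contrast in the abstract concrete, the following Python sketch compares a materialize-then-sort evaluation with an incremental one that stops as soon as k results have been produced. It is only an illustration under stated assumptions, not the SPARQL-RANK execution model or the paper's rank-aware join: the data, the scoring function, and the helper names (materialize_then_sort, incremental_top_k, counting) are hypothetical, and the incremental variant assumes that some access path can already deliver solutions in descending score order.

```python
from itertools import islice


def materialize_then_sort(solutions, score, k):
    """Baseline: compute every solution, sort all of them, keep the first k."""
    ranked = sorted(solutions, key=score, reverse=True)
    return ranked[:k]


def incremental_top_k(ordered_solutions, k):
    """Pull from an iterator that is already in descending score order
    and stop after k items, leaving the rest of the input unread."""
    return list(islice(ordered_solutions, k))


def counting(iterable, counter):
    """Yield items unchanged while recording how many were pulled."""
    for item in iterable:
        counter[0] += 1
        yield item


def score(solution):
    """Hypothetical scoring function: the second field of each pair."""
    return solution[1]


# Hypothetical data: 10,000 solutions, each a (name, score) pair.
solutions = [(f"item{i}", (i * 37) % 10007) for i in range(10000)]

# The baseline touches and sorts all 10,000 solutions.
baseline = materialize_then_sort(solutions, score, 10)

# Assumption: an index can hand solutions over in descending score order.
# Here the index is simulated by sorting once up front.
index = sorted(solutions, key=score, reverse=True)
pulled = [0]
top = incremental_top_k(counting(index, pulled), 10)

print(top == baseline)  # both strategies return the same answer
print(pulled[0])        # number of solutions the incremental variant read
```

The sketch only covers a single ordered input. Combining several ordered inputs, as a rank-aware join does, additionally requires a bound on the scores of results not yet seen, which is outside the scope of this illustration.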

Resource URI on the dog food server: http://data.semanticweb.org/conference/iswc/2012/proceedings-2/paper-10
