Optimizing Query Shortcuts in RDF Databases

Presented at: 8th Extended Semantic Web Conference (ESWC2011)

by Vicky Dritsou, Panos Constantopoulos, Antonios Deligiannakis, Yannis Kotidis

The emergence of the Semantic Web has led to the creation of large semantic knowledge bases, often in the form of RDF databases. Improving the performance of RDF databases requires specialized data management techniques, such as the use of shortcuts in place of path queries. In this paper we address the problem of selecting, under a space constraint, the most beneficial shortcuts for reducing the execution cost of path queries in RDF databases. We first demonstrate that this problem is an instance of the quadratic knapsack problem. Given the computational complexity of solving such problems, we then develop an alternative formulation based on a bi-criterion linear relaxation, which seeks to minimize a weighted sum of the query cost and the required space consumption. As we demonstrate, this relaxation leads to very efficient classes of linear programming solutions. We use this bi-criterion linear relaxation in an algorithm that selects a subset of shortcuts to materialize. This shortcut selection algorithm is extensively evaluated and compared with a greedy algorithm that we developed in prior work. The reported experiments show that the linear relaxation algorithm significantly reduces query execution times and also outperforms the greedy solution.
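To make the quadratic knapsack view concrete, the following is a minimal sketch of a generic quadratic knapsack instance solved by exhaustive search. It is not the paper's algorithm: the function name `best_shortcut_set`, the toy numbers, and the reading of the pairwise terms as interactions between candidate shortcuts are all assumptions made for illustration.

```python
from itertools import combinations


def best_shortcut_set(benefit, pair_benefit, size, budget):
    """Exhaustively solve a tiny quadratic knapsack instance.

    benefit[i]          -- hypothetical cost saving of candidate shortcut i
    pair_benefit[(i,j)] -- hypothetical interaction term for choosing both i and j (i < j)
    size[i]             -- hypothetical space needed to materialize shortcut i
    budget              -- total space available
    """
    n = len(size)
    best_value, best_set = 0, ()
    for k in range(n + 1):
        for chosen in combinations(range(n), k):
            # Skip selections that exceed the space constraint.
            if sum(size[i] for i in chosen) > budget:
                continue
            # Linear part of the objective: individual benefits.
            value = sum(benefit[i] for i in chosen)
            # Quadratic part: pairwise interactions among chosen items.
            value += sum(pair_benefit.get(pair, 0) for pair in combinations(chosen, 2))
            if value > best_value:
                best_value, best_set = value, chosen
    return best_value, best_set


# Toy instance (all values are made up for illustration).
benefit = [6, 5, 4]
pair_benefit = {(0, 1): -4, (1, 2): 3}
size = [3, 2, 2]
print(best_shortcut_set(benefit, pair_benefit, size, budget=5))
```

The search visits every subset, so its running time grows exponentially with the number of candidates. That blow-up is the kind of cost that motivates replacing the exact problem with a cheaper relaxation.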

Keywords: Path Queries, Query Cost Reduction, RDF Databases

Resource URI on the dog food server: http://data.semanticweb.org/conference/eswc/2011/paper/semantic-data-management/26
