A Structural Approach to Indexing Triples

Presented at: 9th Extended Semantic Web Conference (ESWC2012)

by Francois Picalausa, Yongming Luo, George H. L. Fletcher, Jan Hidders, Stijn Vansummeren

As an essential part of the W3C’s semantic web stack and linked data initiative, RDF data management systems (also known as triplestores) have drawn considerable research attention. The majority of these systems use value-based indexes (e.g., B+-trees) for physical storage and ignore many of the structural aspects present in RDF graphs. Structural indexes, on the other hand, have been successfully applied in XML and semi-structured data management to exploit structural graph information in query processing. In those settings, a structural index groups nodes in a graph based on some equivalence criterion, for example, indistinguishability with respect to some query workload (usually XPath). Motivated by this body of work, we have started the SAINT-DB project to study and develop a native RDF management system based on structural indexes. In this paper we present a principled framework for designing and using RDF structural indexes for practical fragments of SPARQL, based on recent formal structural characterizations of these fragments. We then explain how structural indexes can be incorporated in a typical query processing workflow, and we discuss the design, implementation, and initial empirical evaluation of our approach.

Keywords: RDF management systems, query processing, structural indexes

Resource URI on the dog food server: http://data.semanticweb.org/conference/eswc/2012/paper/research/40
