On the Efficiency of Joining Group Patterns in SPARQL Queries

Presented at: 7th Extended Semantic Web Conference (ESWC2010)

by Maria Esther Vidal, Edna Ruckhaus, Tomas Lampo, Javier Sierra, Amadis Martinez, Axel Polleres

In SPARQL queries, triple patterns are combined through the variables they share. Based on this characterization, the basic graph patterns in a SPARQL query can be partitioned into star-shaped groups, that is, groups of acyclic pattern combinations that share exactly one variable. We observe that the number of triples in a group is proportional to the number of individuals that play the role of the subject or the object; however, depending on how far the subject individuals participate in the properties, a group may be not much larger than a class or type to which the subject or object belongs. Thus, it may be significantly more efficient to evaluate each group independently and then merge the resulting sets than to linearly join all triples in a basic graph pattern. Based on these properties of star-shaped groups, we have developed query optimization and evaluation techniques, and we have conducted an empirical analysis of their benefits in several SPARQL query engines. We observe that the proposed techniques speed up query evaluation for join queries with star-shaped patterns by at least one order of magnitude.
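
The idea of evaluating star-shaped groups independently and then merging the results can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy in-memory triple list, groups patterns by their subject term only (the abstract also allows the shared variable to sit in the object position), uses a plain hash join to merge groups, and leaves out the cost model entirely. All function names and data below are hypothetical.

```python
from collections import defaultdict
from functools import reduce


def is_var(term):
    """SPARQL-style variables are written with a leading '?'."""
    return term.startswith("?")


def star_groups(patterns):
    """Partition triple patterns into groups that share the same subject term.
    Simplifying assumption: only subject-centred stars are detected."""
    groups = defaultdict(list)
    for pattern in patterns:
        groups[pattern[0]].append(pattern)
    return list(groups.values())


def match(pattern, triple, binding):
    """Return binding extended so that pattern matches triple, else None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def eval_group(group, triples):
    """Evaluate one star-shaped group on its own (naive nested loops)."""
    bindings = [{}]
    for pattern in group:
        bindings = [b2 for b in bindings for t in triples
                    if (b2 := match(pattern, t, b)) is not None]
    return bindings


def join(left, right):
    """Merge two group results with a hash join on their shared variables."""
    if not left or not right:
        return []
    shared = sorted(set(left[0]) & set(right[0]))
    index = defaultdict(list)
    for b in left:
        index[tuple(b[v] for v in shared)].append(b)
    return [{**l, **r} for r in right
            for l in index[tuple(r[v] for v in shared)]]


def evaluate(patterns, triples):
    """Evaluate each group independently, then merge the result sets."""
    return reduce(join, (eval_group(g, triples) for g in star_groups(patterns)))


# Hypothetical toy data and query, for illustration only.
triples = [
    ("alice", "type", "Author"), ("alice", "wrote", "p1"),
    ("bob", "type", "Author"), ("bob", "wrote", "p2"),
    ("p1", "type", "Paper"), ("p1", "venue", "ESWC"),
    ("p2", "type", "Paper"), ("p2", "venue", "ISWC"),
]
query = [
    ("?a", "type", "Author"), ("?a", "wrote", "?p"),
    ("?p", "type", "Paper"), ("?p", "venue", "ESWC"),
]

groups = star_groups(query)
results = evaluate(query, triples)
print(results)
```

Here the query splits into two stars, one centred on `?a` and one on `?p`. Each star is reduced to a small set of bindings before the two sets are merged on `?p`, rather than joining all four patterns one after another.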

Keywords: Cost Models, Query Optimization, RDF Query Engines, SPARQL

Resource URI on the dog food server: http://data.semanticweb.org/conference/eswc/2010/paper/onto/57
