SparqPlug: Generating Linked Data from Legacy HTML, SPARQL and the DOM

Presented at: Linked Data on the Web (LDOW2008)

by Peter Coetzee, Tom Heath, Enrico Motta

Webpage: http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-369/paper05.pdf

The availability of linked RDF data remains a significant barrier to the realisation of a Semantic Web. In this paper we present SparqPlug, a framework that uses the SPARQL query language and the HTML Document Object Model to convert legacy HTML-only data sets into RDF. It improves upon existing approaches in a number of ways: for example, it allows HTML data to be queried using the full flexibility of SPARQL, and it makes converted data automatically available in the Semantic Web. We outline the process of mass generation of RDF from HTML using SparqPlug and illustrate this with a case study. The paper concludes with an examination of factors affecting SparqPlug's performance across various forms of HTML data.


Resource URI on the dog food server: http://data.semanticweb.org/workshop/LDOW/2008/paper/13

