What have Innsbruck and Leipzig in common? Extracting Semantics from Wiki Content

Presented at: 4th European Semantic Web Conference (ESWC2007)

by Sören Auer, Jens Lehmann

Webpage: http://www.eswc2007.org/pdf/eswc07-auer.pdf

Wikis are an established means for the collaborative authoring, versioning and publishing of textual articles. The Wikipedia project, for example, succeeded in creating by far the largest encyclopedia solely on the basis of a wiki. Recently, several approaches have been proposed on how to extend wikis to allow the creation of structured and semantically enriched content. However, the means for creating semantically enriched structured content are already available and are, although unconsciously, even used by Wikipedia authors. In this article, we present a method for revealing this structured content by extracting information from template instances. We suggest ways to efficiently query the vast amount of extracted information (e.g. more than 8 million RDF statements for the English Wikipedia version alone), leading to astonishing query answering possibilities (such as for the title question). We analyze the quality of the extracted content and propose strategies for quality improvements with just minor modifications of the wiki systems currently in use.
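
To illustrate the idea of extracting statements from template instances, here is a minimal Python sketch. It is not the authors' extraction algorithm: the sample markup, the template name, the function name `extract_triples`, and the plain (subject, predicate, object) tuple format are all assumptions made for illustration. It handles only a single, non-nested template and does not cope with values that contain a `|`, such as piped wiki links.

```python
import re

# Hypothetical wiki markup for illustration only; not taken from the paper.
SAMPLE = """{{Infobox Town AT
| name = Innsbruck
| state = Tyrol
| country = Austria
}}"""


def extract_triples(page_title, wikitext):
    """Return (subject, predicate, object) tuples from the first template.

    Assumption: the template is not nested and uses `| key = value` pairs.
    """
    match = re.search(r"\{\{(.*?)\}\}", wikitext, re.DOTALL)
    if match is None:
        return []
    # The first part is the template name; the rest are its parameters.
    parts = match.group(1).split("|")
    triples = []
    for part in parts[1:]:
        if "=" not in part:
            continue  # skip positional (unnamed) parameters
        key, value = part.split("=", 1)
        key, value = key.strip(), value.strip()
        if key and value:
            triples.append((page_title, key, value))
    return triples


if __name__ == "__main__":
    for triple in extract_triples("Innsbruck", SAMPLE):
        print(triple)
```

Each template attribute becomes one statement about the page, which is the kind of structured data that could then be stored and queried as RDF.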

Keywords: Browsing, Knowledge extraction, Querying, Wiki, Wikipedia

Resource URI on the dog food server: http://data.semanticweb.org/conference/eswc/2007/paper-284

