Domain Specific Data Retrieval on the Semantic Web

Presented at: 9th Extended Semantic Web Conference (ESWC2012)

by Tuukka Ruotsalo

Web content no longer consists only of general text documents, but increasingly of structured, domain-specific data published in the Linked Open Data (LOD) cloud. Data collections in this cloud are, by definition, from different domains and indexed with domain-specific ontologies and schemas. Such data representation requires retrieval methods that can operate on structured data and semantic feature spaces and remain effective even for small domain-specific collections. Unlike previous research, which has concentrated on extending text search by using ontologies as a source for query expansion, we introduce a retrieval framework based on the well-known vector space model of information retrieval to fully support retrieval for Semantic Web data described in the Resource Description Framework (RDF). We propose an indexing structure, a ranking method, and a way to incorporate reasoning and query expansion in the framework. We evaluate the approach in ad-hoc search using a cultural heritage data collection. Compared to a baseline, experimental results show up to 77% improvement when a combination of reasoning and query expansion is used.

Keywords: Domain Specific Data Retrieval, Query expansion, Semantic Web, Semantic search, Web Data Retrieval

