Distributed Human Computation Framework for Linked Data Co-reference Resolution

Presented at: 8th Extended Semantic Web Conference (ESWC2011)

by Yang Yang, Wendy Hall, Nigel Shadbolt, Priyanka Singh, Jiadi Yao, Au Yeung Ching-man, Amir Zareian, Xiaowei Wang, Zhonglun Cai, Manuel Salvadores, Nicholas Gibbins

Abstract. Distributed Human Computation (DHC) solves computational problems by incorporating the collaborative effort of a large number of humans. It is also a solution to AI-complete problems such as natural language processing. The Semantic Web, with its roots in AI, has many research problems that are considered AI-complete. For example, co-reference resolution, which involves determining whether different URIs refer to the same entity, is a significant hurdle to overcome in the realisation of large-scale Semantic Web applications. In this paper, we propose a framework for building a DHC system on top of the Linked Data Cloud to solve various computational problems. To demonstrate the concept, we focus on handling co-reference resolution when integrating distributed datasets. Traditionally, machine-learning algorithms are used as a solution for this, but they are often computationally expensive, error-prone and do not scale. We designed a DHC system named iamResearcher, which solves the scientific publication author identity co-reference problem when integrating distributed bibliographic datasets. In our system, we aggregated 6 million bibliographic records from various publication repositories. Users can sign up to the system to audit and align their own publications, thus solving the co-reference problem in a distributed manner. The aggregated results are dereferenceable in the Open Linked Data Cloud.
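To make the idea of human-driven co-reference resolution concrete, the following is a minimal sketch of how user confirmations that two author URIs denote the same person could be merged into equivalence classes and exported as owl:sameAs statements. It is an illustration only, not the iamResearcher implementation: the class name CorefResolver, its methods, and the example.org URIs are all hypothetical, and union-find is simply one convenient way to merge pairwise confirmations.

```python
from collections import defaultdict

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


class CorefResolver:
    """Hypothetical helper: union-find over author URIs.

    Each user confirmation ("these two URIs are me") merges two sets.
    """

    def __init__(self):
        self.parent = {}

    def find(self, uri):
        # Register unseen URIs as their own root, then walk up to the root.
        self.parent.setdefault(uri, uri)
        while self.parent[uri] != uri:
            self.parent[uri] = self.parent[self.parent[uri]]  # path halving
            uri = self.parent[uri]
        return uri

    def confirm_same(self, uri_a, uri_b):
        # Record a human judgement that both URIs refer to one entity.
        root_a, root_b = self.find(uri_a), self.find(uri_b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def clusters(self):
        # Group URIs by root; singletons carry no co-reference information.
        groups = defaultdict(set)
        for uri in list(self.parent):
            groups[self.find(uri)].add(uri)
        return [group for group in groups.values() if len(group) > 1]

    def same_as_triples(self):
        # Emit N-Triples lines linking each URI to a canonical one
        # (here simply the lexicographically smallest URI in the cluster).
        triples = []
        for group in self.clusters():
            canonical, *others = sorted(group)
            for other in others:
                triples.append(f"<{canonical}> <{OWL_SAME_AS}> <{other}> .")
        return triples


# Example with made-up URIs from three hypothetical repositories.
resolver = CorefResolver()
resolver.confirm_same("http://example.org/repoA/author/1",
                      "http://example.org/repoB/person/42")
resolver.confirm_same("http://example.org/repoB/person/42",
                      "http://example.org/repoC/creator/7")
for line in resolver.same_as_triples():
    print(line)
```

Because each confirmation only touches the URIs a user already knows to be their own, the work is spread across users rather than computed centrally, which is the distributed aspect the abstract describes.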

Keywords: Co-reference, Crowd-sourcing, DHC, Distributed Human Computation, Linked Data

Resource URI on the dog food server: http://data.semanticweb.org/conference/eswc/2011/paper/digital-libraries/10
