Presented at: 11th ESWC 2014 (ESWC2014)
A primary challenge in Web data integration is coreference resolution, namely identifying entity descriptions from different data sources that refer to the same real-world entity. Increasingly, solutions to coreference resolution have humans in the loop. For instance, many active learning, crowdsourcing, and pay-as-you-go approaches solicit user feedback to verify candidate coreferent entities computed by automatic methods. Whereas reducing the number of verification tasks is a major consideration for these approaches, very little attention has been paid to the efficiency of performing each individual verification task. To address this issue, instead of showing the entire, possibly lengthy descriptions of two entities for verification, we propose to extract and present a compact summary of them. We expect that such length-limited comparative entity summaries can help human users verify more efficiently without significantly hurting the accuracy of their verification. Our approach exploits the common and different features of two entities that best help indicate (non-)coreference, and also considers the diverse information about their identities. Experimental results show that verification is 2.7 to 2.9 times faster when using our comparative entity summaries, and that its accuracy is not notably affected.
Keywords: comparative entity summary, entity summarization, entity consolidation, coreference resolution
Resource URI on the dog food server: http://data.semanticweb.org/conference/eswc/2014/paper/research/26