Learning to Recognize Reliable Users and Content in Social Media with Coupled Mutual Reinforcement

Presented at: 18th International World Wide Web Conference (WWW2009)

by Jiang Bian, Yandong Liu, Ding Zhou, Eugene Agichtein, Hongyuan Zha

Webpage: http://www2009.eprints.org/6/1/p51.pdf

Community Question Answering (CQA) has emerged as a popular forum for users to pose questions for other users to answer. Over the last few years, CQA portals such as Naver and Yahoo! Answers have exploded in popularity, and now provide a viable alternative to general-purpose Web search. At the same time, the answers to past questions submitted to CQA sites comprise a valuable knowledge repository that could be a gold mine for information retrieval and automatic question answering. Unfortunately, the quality of the submitted questions and answers varies widely, and increasingly so, to the point that a large fraction of the content is not usable for answering queries. Previous approaches for retrieving relevant and high-quality content have been proposed, but they require large amounts of manually labeled data, which limits the applicability of these supervised approaches to new sites and domains. In this paper we address this problem by developing a semi-supervised coupled mutual reinforcement framework for simultaneously calculating content quality and user reputation that requires relatively few labeled examples to initialize the training process. Results of a large-scale evaluation demonstrate that our methods are more effective than previous approaches for finding high-quality answers, questions, and users. More importantly, our quality estimation significantly improves the accuracy of search over CQA archives compared with state-of-the-art methods.

Keywords: Data Mining

Resource URI on the dog food server: http://data.semanticweb.org/conference/www/2009/paper/6
