Crowdsourcing Taxonomies

Presented at: 9th Extended Semantic Web Conference (ESWC2012)

by Dimitris Karampinas, Peter Triantafillou

Taxonomies are a useful mechanism to organize, evaluate, and search web content. Many popular classes of web applications rely on them, including product categorization, comparative pricing of similar products, localized services, and vertical or enterprise search. However, their manual generation and maintenance by experts is time-consuming and cumbersome, and often results in platform-dependent, static vocabularies. Hence much current research focuses on more flexible and dynamic methods of developing them, as evidenced, for example, by the strong interest in folksonomies within the social media realm. We propose a new approach for constructing taxonomies. Our idea stems from users' growing involvement in, and desire for, tagging and annotating web content (e.g., in social media and product categorization applications). We define the required input from human users in the form of explicit structural information, that is, supertype-subtype relationships between concepts, which humans understand well. In this way, we harvest, via common annotation practices, the collective wisdom of users with respect to the (categorization of) web content they share and access. We further define the principles upon which crowdsourced taxonomy construction algorithms should be based, and we show that the resulting problem is NP-Hard. We provide heuristic algorithms and relevant optimizations that aggregate human input, resolve conflicts in that input, and produce taxonomies. We evaluate our algorithms on real-world crowdsourcing experiments (in which real users provide such information) and on real-world taxonomies.
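
To make the idea of aggregating conflicting supertype-subtype input concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. It assumes each user contribution is a (supertype, subtype) pair, that the target taxonomy is a tree in which every concept has at most one parent, and that conflicts are resolved greedily in favor of the pair with more votes. The function name `build_taxonomy` and the sample data are hypothetical.

```python
from collections import Counter


def build_taxonomy(votes):
    """Greedy sketch: aggregate (supertype, subtype) votes into a tree.

    Returns a dict mapping each subtype to its chosen supertype.
    Assumption: single-parent tree, majority-first conflict resolution.
    """
    counts = Counter(votes)
    parent = {}  # subtype -> chosen supertype

    def is_ancestor(a, b):
        # True if a equals b or a is an ancestor of b in the tree so far.
        while b is not None:
            if b == a:
                return True
            b = parent.get(b)
        return False

    # Visit pairs by descending vote count; break ties by the pair itself
    # so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for (sup, sub), _ in ranked:
        if sub in parent:
            continue  # a better-supported parent was already chosen
        if is_ancestor(sub, sup):
            continue  # adding this edge would create a cycle
        parent[sub] = sup
    return parent


# Hypothetical crowd input with one contradictory vote ("dog" above "animal").
sample_votes = (
    [("animal", "dog")] * 3
    + [("dog", "animal")]
    + [("mammal", "dog")] * 2
    + [("animal", "mammal")] * 2
    + [("mammal", "cat")]
)
taxonomy = build_taxonomy(sample_votes)
```

With this sample input, the contradictory vote placing "dog" above "animal" is discarded because it would close a cycle. The sketch also shows a weakness of the greedy choice: "dog" is attached directly to "animal" rather than to "mammal", since the first pair received more votes. Given that the abstract states the underlying problem is NP-Hard, any such greedy pass should be read as a heuristic only.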

Keywords: collective intelligence, crowdsourcing, social web, tagging, taxonomies
