Large Scale Multi-Label Classification via MetaLabeler

Presented at: 18th International World Wide Web Conference (WWW2009)

by Lei Tang, Suju Rajan, Vijay K. Narayanan


The explosion of online content has made managing it non-trivial. Web-related tasks such as web page categorization, news filtering, query categorization, and tag recommendation often involve constructing multi-label categorization systems on a large scale. Existing multi-label classification methods either do not scale or perform unsatisfactorily. In this work, we propose MetaLabeler to automatically determine the relevant set of labels for each instance without intensive human involvement or expensive cross-validation. Extensive experiments conducted on benchmark data show that MetaLabeler tends to outperform existing methods. Moreover, MetaLabeler scales to millions of multi-labeled instances and can be deployed easily. This enables us to apply MetaLabeler to a large-scale query categorization problem at Yahoo!, yielding a significant improvement in performance.

Keywords: Data Mining
