SCHEMA - An Algorithm for Automated Product Taxonomy Mapping in E-commerce

Presented at: 9th Extended Semantic Web Conference (ESWC2012)

by Steven Aanen, Lennart Nederstigt, Damir Vandic, Flavius Frasincar

This paper proposes SCHEMA, an algorithm for automated mapping between heterogeneous product taxonomies in the e-commerce domain. It contributes towards effective aggregation of product information from different sources, in order to reduce search failures in online shopping. SCHEMA utilises word sense disambiguation techniques, based on the ideas from the algorithm proposed by Lesk, in combination with the semantic lexicon WordNet. It introduces a node matching function, based on inclusiveness of the categories in conjunction with the Levenshtein distance for class labels, for finding candidate map categories, and for assessing path-similarity. The final mapping quality score is calculated using the Damerau-Levenshtein distance and a node-dissimilarity penalty. The performance of SCHEMA was tested on three real-life datasets and compared with PROMPT and the algorithm proposed by Park & Kim. It is shown that SCHEMA improves considerably on both recall and F1-score, while maintaining similar precision.
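As a rough illustration of the Levenshtein-based label comparison mentioned in the abstract, the following minimal sketch computes the edit distance between two category labels and scales it into a similarity in [0, 1]. The function names, the lower-casing step, and the normalisation by the longer label are assumptions made for this example; this is not SCHEMA's node matching function, which also relies on category inclusiveness and WordNet-based disambiguation.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    # Keep the shorter string in the inner loop so the row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def label_similarity(label_a: str, label_b: str) -> float:
    """Hypothetical helper: normalised similarity of two category labels.

    Assumption: labels are compared case-insensitively and the distance
    is divided by the length of the longer label.
    """
    a, b = label_a.strip().lower(), label_b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty labels are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

A mapping procedure could, for instance, treat two labels as candidate matches when `label_similarity` exceeds a chosen threshold; any such threshold would have to be tuned on real taxonomy data.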

Keywords: product taxonomy, semantic similarity, taxonomy mapping, word sense disambiguation
