A Comparison of Various Methods for Concept Tagging for Spoken Language Understanding

Presented at: The Sixth International Language Resources and Evaluation Conference (LREC2008)

by Stefan Hahn, Patrick Lehnen, Christian Raymond, Hermann Ney

Webpage: http://www.lrec-conf.org/proceedings/lrec2008/pdf/749_paper.pdf
Webpage: http://www.lrec-conf.org/proceedings/lrec2008/summaries/749.html

The extraction of flat concepts from a given word sequence is usually one of the first steps in building a spoken language understanding (SLU) or dialogue system. This paper explores five different modelling approaches for this task and presents results on MEDIA, a state-of-the-art French corpus. Additionally, two log-linear modelling approaches were further improved by adding morphological knowledge. This paper goes beyond what has been reported in the literature. We applied the models to the same training and testing data and evaluated the experimental results with the NIST scoring toolkit, ensuring identical conditions for each experiment and comparable results. Using a model based on conditional random fields, we achieve a concept error rate of 11.8% on the MEDIA evaluation corpus.

Keywords: Dialogue & Natural Interactivity, Speech recognition and understanding, Tagging, Linguistics


Resource URI on the dog food server: http://data.semanticweb.org/conference/lrec/2008/papers/749

