Presented at: The Sixth International Language Resources and Evaluation Conference (LREC2008)
by Daniel Zeman
Webpage: http://www.lrec-conf.org/proceedings/lrec2008/pdf/66_paper.pdf
Part-of-speech or morphological tags are important means of annotation in a vast number of corpora. However, different sets of tags are used in different corpora, even for the same language. Tagset conversion is difficult, and solutions tend to be tailored to a particular pair of tagsets. We propose a universal approach that makes the conversion tools reusable. We also provide an indirect evaluation in the context of a parsing task.
Keywords: Corpus (creation, annotation, etc.), Standards for LRs, Tagging, Linguistics
Resource URI on the dog food server: http://data.semanticweb.org/conference/lrec/2008/papers/66