Presented at: The Sixth International Language Resources and Evaluation Conference (LREC2008)
by Daniela Goecke, Maik Stührenberg, Andreas Witt
Webpage: http://www.lrec-conf.org/proceedings/lrec2008/pdf/368_paper.pdf
We report the results of a study that investigates inter-annotator agreement on anaphoric annotations. The study focuses on the influence of two factors, text length and text type, in a corpus of scientific articles and newspaper texts. To measure inter-annotator agreement, we compare existing approaches and propose measuring each step of the annotation process separately rather than measuring only the resulting anaphoric relations. A total of 3,642 anaphoric relations have been annotated in a corpus of 53,038 tokens (12,327 markables). The results show that text type has more influence on inter-annotator agreement than text length. Furthermore, well-defined annotation instructions and coder training are crucial for obtaining good annotation results.
Keywords: Anaphora, Coreference, Corpus (creation, annotation, etc.), Validation of LRs, Linguistics
Resource URI on the dog food server: http://data.semanticweb.org/conference/lrec/2008/papers/368