Evaluation of a Cross-lingual Romanian-English Multi-document Summariser

Presented at: The Sixth International Language Resources and Evaluation Conference (LREC2008)

by Constantin Orăsan, Oana Andreea Chiorean

Webpage: http://www.lrec-conf.org/proceedings/lrec2008/pdf/539_paper.pdf
Webpage: http://www.lrec-conf.org/proceedings/lrec2008/slides/539.ppt
Webpage: http://www.lrec-conf.org/proceedings/lrec2008/summaries/539.html

The rapid growth of the Internet means that more information is available than ever before. Multilingual multi-document summarisation offers a way to access this information even when it is not in a language the reader speaks, by extracting the gist from related documents and translating it automatically. This paper presents an experiment in which Maximal Marginal Relevance (MMR), a well-known multi-document summarisation method, is used to produce summaries from Romanian news articles. A task-based evaluation performed on both the original summaries and their automatically translated versions reveals that they still contain a significant portion of the important information from the original texts. However, direct evaluation of the automatically translated summaries shows that they are not very legible, which can put off readers who want to find out more about a topic.
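For readers unfamiliar with MMR, the following is a minimal sketch of the general technique, not the system evaluated in the paper. MMR greedily selects the sentence that maximises lam * relevance - (1 - lam) * redundancy, where redundancy is the highest similarity to any sentence already chosen. The function names (`cosine`, `mmr_select`), the whitespace tokenisation, the bag-of-words cosine similarity, and the default values of `k` and `lam` are all assumptions made for illustration; the abstract does not state which similarity measure or parameters the authors used.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(sentences, query, k=2, lam=0.7):
    """Greedily pick up to k sentences, trading relevance against redundancy.

    Hypothetical helper: `lam` near 1 favours relevance to the query,
    `lam` near 0 favours novelty with respect to sentences already picked.
    """
    # Assumption: naive lowercase whitespace tokenisation.
    vectors = [Counter(s.lower().split()) for s in sentences]
    query_vec = Counter(query.lower().split())
    selected = []
    candidates = list(range(len(sentences)))

    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(vectors[i], query_vec)
            # Highest similarity to anything already in the summary.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    return [sentences[i] for i in selected]
```

With `lam=1.0` the selection reduces to plain relevance ranking, so near-duplicate sentences from related news articles can both be chosen; lowering `lam` penalises that overlap, which is the property that makes the method suitable for multi-document input.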

Keywords: Machine Translation, SpeechToSpeech Translation, Summarisation, Linguistics


Resource URI on the dog food server: http://data.semanticweb.org/conference/lrec/2008/papers/539

