Corpus and Voices for Catalan Speech Synthesis

Presented at: The Sixth International Language Resources and Evaluation Conference (LREC2008)

by Antonio Bonafonte, Jordi Adell, Ignasi Esquerra, Silvia Gallego, Asuncion Moreno, Javier Pérez

Webpage: http://www.lrec-conf.org/proceedings/lrec2008/pdf/835_paper.pdf
Webpage: http://www.lrec-conf.org/proceedings/lrec2008/summaries/835.html

In this paper we describe the design and production of a Catalan database for building synthetic voices. Two speakers have each recorded 10 hours of speech. The speaker selection and the corpus design aim to provide resources for high-quality synthesis. The resources have been used to build voices for the Festival TTS system. Both the original recordings and the Festival databases are freely available for research and for commercial use.

Keywords: Corpus (creation, annotation, etc.), Endangered languages, Speech synthesis, Text-to-speech systems, Linguistics


Resource URI on the dog food server: http://data.semanticweb.org/conference/lrec/2008/papers/835

