Mining Semantic Descriptions of Bioinformatics Web Services from the Literature

Presented at: 6th Annual European Semantic Web Conference (ESWC2009)

by Hammad Afzal, Robert Stevens, Goran Nenadic

A number of projects (e.g. myGrid, BioMOBY) have been initiated to organise emerging bioinformatics Web services and to provide their semantic descriptions. They typically rely on manual curation. In this paper we focus on a semi-automated approach to mining semantic descriptions from the bioinformatics literature. The method combines terminological processing and dependency parsing of journal articles, and applies information extraction techniques to profile Web services using informative textual passages, related ontological annotations and service descriptors. Service descriptors are terminological phrases reflecting specific roles (e.g. input, output) of the related semantic classes (e.g. algorithm, database). They can be used both to facilitate subsequent manual description of services and to provide a semantic synopsis of a service that can be used to locate related services. We present a case study involving a subset of full-text articles from the BMC Bioinformatics journal. We illustrate the potential of natural language processing not only for mining descriptions of known services, but also for discovering new services that have been described in the literature.
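
To make the notion of a service descriptor more concrete, the following is a minimal illustrative sketch, not the method used in the paper. It assumes two simplifications: hand-written regular expressions stand in for dependency parsing, and a tiny hypothetical lexicon (SEMANTIC_CLASSES) stands in for ontological annotation. All names in it (ServiceDescriptor, ROLE_PATTERNS, extract_descriptors) are invented for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-lexicon mapping head nouns to semantic classes.
# This is only a stand-in for ontology-based annotation.
SEMANTIC_CLASSES = {
    "sequence": "data",
    "sequences": "data",
    "alignment": "data",
    "database": "database",
    "algorithm": "algorithm",
}

# Hypothetical surface patterns; regular expressions are swapped in here
# for the dependency-parse rules a real system would rely on.
ROLE_PATTERNS = [
    ("input", re.compile(r"takes (?:an? |the )?([\w ]+?) as input")),
    ("output", re.compile(
        r"(?:returns|produces) (?:an? |the )?([\w ]+?)(?:[.,;]| and |$)")),
]


@dataclass(frozen=True)
class ServiceDescriptor:
    role: str            # e.g. "input" or "output"
    semantic_class: str  # e.g. "data", "algorithm", "database"
    phrase: str          # the terminological phrase found in the text


def extract_descriptors(sentence):
    """Return the descriptors matched by the toy patterns in one sentence."""
    descriptors = []
    text = sentence.lower()
    for role, pattern in ROLE_PATTERNS:
        for match in pattern.finditer(text):
            phrase = match.group(1).strip()
            # Assumption: the last word of the phrase is its head noun.
            head = phrase.split()[-1]
            semantic_class = SEMANTIC_CLASSES.get(head, "unknown")
            descriptors.append(ServiceDescriptor(role, semantic_class, phrase))
    return descriptors


example = ("The service takes a protein sequence as input "
           "and returns a multiple alignment.")
found = extract_descriptors(example)
```

For the example sentence, the sketch yields one descriptor with role "input" and one with role "output", both mapped to the hypothetical class "data". A set of such role and class pairs is the kind of compact synopsis that could be compared across services to locate related ones.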

Keywords: Bioinformatics Web Services, Natural Language Processing, Ontology-based meta data generation, Service Description, Service Discovery, Annotation, Application, Data, E-Government, E-Learning, E-Science, E-Health, Electronic Business, Language Technology, Machine Learning, NLP, Ontology (Computer Science), Semantic Web, Semantic Web Services, Web Service

