
Automating generation of textual class definitions from OWL to English

BACKGROUND: Text definitions for entities within bio-ontologies are a cornerstone of the effort to gain a consensus in understanding and usage of those ontologies. Writing these definitions is, however, a considerable effort and there is often a lag between specification of the main part of an ontol...


Bibliographic Details
Main authors: Stevens, Robert, Malone, James, Williams, Sandra, Power, Richard, Third, Allan
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3102894/
https://www.ncbi.nlm.nih.gov/pubmed/21624160
http://dx.doi.org/10.1186/2041-1480-2-S2-S5
author Stevens, Robert
Malone, James
Williams, Sandra
Power, Richard
Third, Allan
author_facet Stevens, Robert
Malone, James
Williams, Sandra
Power, Richard
Third, Allan
author_sort Stevens, Robert
collection PubMed
description BACKGROUND: Text definitions for entities within bio-ontologies are a cornerstone of the effort to gain a consensus in understanding and usage of those ontologies. Writing these definitions is, however, a considerable effort and there is often a lag between specification of the main part of an ontology (logical descriptions and definitions of entities) and the development of the text-based definitions. The goal of natural language generation (NLG) from ontologies is to take the logical description of entities and generate fluent natural language. The application described here uses NLG to automatically provide text-based definitions from an ontology that has logical descriptions of its entities, so avoiding the bottleneck of authoring these definitions by hand. RESULTS: To produce the descriptions, the program collects all the axioms relating to a given entity, groups them according to common structure, realises each group through an English sentence, and assembles the resulting sentences into a paragraph, to form as ‘coherent’ a text as possible without human intervention. Sentence generation is accomplished using a generic grammar based on logical patterns in OWL, together with a lexicon for realising atomic entities. We have tested our output for the Experimental Factor Ontology (EFO) using a simple survey strategy to explore the fluency of the generated text and how well it conveys the underlying axiomatisation. Two rounds of survey and improvement show that overall the generated English definitions are found to convey the intended meaning of the axiomatisation in a satisfactory manner. The surveys also suggested that one form of generated English will not be universally liked; that intrusion of too much ‘formal ontology’ was not liked; and that too much explicit exposure of OWL semantics was also not liked. CONCLUSIONS: Our prototype tools can generate reasonable paragraphs of English text that can act as definitions. The definitions were found acceptable by our survey and, as a result, the developers of EFO are sufficiently satisfied with the output that the generated definitions have been incorporated into EFO. Whilst not a substitute for hand-written textual definitions, our generated definitions are a useful starting point. AVAILABILITY: An on-line version of the NLG text definition tool can be found at http://swat.open.ac.uk/tools/. The questionnaire and sample generated text definitions may be found at http://mcs.open.ac.uk/nlg/SWAT/bio-ontologies.html.
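The pipeline summarised in the RESULTS part of the abstract (collect the axioms for an entity, group them by common structure, realise each group as a sentence, assemble a paragraph) can be illustrated with a minimal Python sketch. This is not the authors' tool: the tuple-based axiom format, the pattern names `SubClassOf` and `SubClassOfSome`, the sample axioms, the `LEXICON` entries and all function names are assumptions made for illustration, and real OWL axioms and the generic grammar described in the abstract are considerably richer.

```python
# Hypothetical, simplified axioms: (pattern, subject, property, filler).
# The sample data is invented for illustration only.
AXIOMS = [
    ("SubClassOf", "hepatocyte", None, "cell"),
    ("SubClassOfSome", "hepatocyte", "part_of", "liver"),
    ("SubClassOfSome", "hepatocyte", "part_of", "digestive system"),
    ("SubClassOf", "liver", None, "organ"),
]

# Hypothetical lexicon mapping atomic entity names to English phrases.
LEXICON = {"part_of": "is part of"}


def lex(name):
    """Look up an English phrase, falling back to the name itself."""
    return LEXICON.get(name, name.replace("_", " "))


def article(noun):
    """Prefix a noun phrase with a naive indefinite article."""
    return ("an " if noun[0].lower() in "aeiou" else "a ") + noun


def join_phrases(items):
    """Join phrases as 'x', 'x and y', or 'x, y and z'."""
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]


def collect(axioms, entity):
    """Step 1: collect all axioms whose subject is the given entity."""
    return [a for a in axioms if a[1] == entity]


def group(axioms):
    """Step 2: group axioms sharing the same pattern and property."""
    groups = {}
    for pattern, _subject, prop, filler in axioms:
        groups.setdefault((pattern, prop), []).append(filler)
    return groups


def realise(entity, key, fillers):
    """Step 3: realise one group of axioms as one English sentence."""
    pattern, prop = key
    subject = article(lex(entity))
    subject = subject[0].upper() + subject[1:]
    objects = join_phrases([article(lex(f)) for f in fillers])
    if pattern == "SubClassOf":
        return f"{subject} is {objects}."
    if pattern == "SubClassOfSome":
        return f"{subject} {lex(prop)} {objects}."
    raise ValueError(f"no sentence template for pattern {pattern!r}")


def define(axioms, entity):
    """Step 4: assemble the sentences into a one-paragraph definition."""
    groups = group(collect(axioms, entity))
    return " ".join(realise(entity, k, v) for k, v in groups.items())


print(define(AXIOMS, "hepatocyte"))
# prints: A hepatocyte is a cell. A hepatocyte is part of a liver and a digestive system.
```

Grouping on the (pattern, property) pair is what lets the two `part_of` axioms share a single sentence instead of producing two near-identical ones; the choice of grouping key here is an assumption, since the abstract only says that axioms are grouped "according to common structure".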
format Text
id pubmed-3102894
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3102894 2011-05-28 Automating generation of textual class definitions from OWL to English Stevens, Robert Malone, James Williams, Sandra Power, Richard Third, Allan J Biomed Semantics Proceedings BioMed Central 2011-05-17 /pmc/articles/PMC3102894/ /pubmed/21624160 http://dx.doi.org/10.1186/2041-1480-2-S2-S5 Text en Copyright ©2011 Stevens et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Stevens, Robert
Malone, James
Williams, Sandra
Power, Richard
Third, Allan
Automating generation of textual class definitions from OWL to English
title Automating generation of textual class definitions from OWL to English
title_full Automating generation of textual class definitions from OWL to English
title_fullStr Automating generation of textual class definitions from OWL to English
title_full_unstemmed Automating generation of textual class definitions from OWL to English
title_short Automating generation of textual class definitions from OWL to English
title_sort automating generation of textual class definitions from owl to english
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3102894/
https://www.ncbi.nlm.nih.gov/pubmed/21624160
http://dx.doi.org/10.1186/2041-1480-2-S2-S5