Extending ontologies by finding siblings using set expansion techniques

Bibliographic Details
Main Authors: Fabian, Götz; Wächter, Thomas; Schroeder, Michael
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3371847/
https://www.ncbi.nlm.nih.gov/pubmed/22689774
http://dx.doi.org/10.1093/bioinformatics/bts215
author Fabian, Götz
Wächter, Thomas
Schroeder, Michael
collection PubMed
description Motivation: Ontologies are an everyday tool in biomedicine to capture and represent knowledge. However, many ontologies lack a high degree of coverage in their domain and need to improve their overall quality and maturity. Automatically extending sets of existing terms will enable ontology engineers to systematically improve text-based ontologies level by level.
Results: We developed an approach to extend ontologies by discovering new terms which are in a sibling relationship to existing terms of an ontology. For this purpose, we combined two approaches which retrieve new terms from the web. The first approach extracts siblings by exploiting the structure of HTML documents, whereas the second approach uses text mining techniques to extract siblings from unstructured text. Our evaluation against MeSH (Medical Subject Headings) shows that our method for sibling discovery is able to suggest first-class ontology terms and can be used as an initial step towards assessing the completeness of ontologies. The evaluation yields a recall of 80% at a precision of 61%, where the two independent approaches complement each other. For MeSH in particular, we show that it can be considered complete in its medical focus area. We integrated the work into DOG4DAG, an ontology generation plugin for the editors OBO-Edit and Protégé, making it the first plugin that supports sibling discovery on-the-fly.
Availability: Sibling discovery for ontology is available as part of DOG4DAG (www.biotec.tu-dresden.de/research/schroeder/dog4dag) for both Protégé 4.1 and OBO-Edit 2.1.
Contact: ms@biotec.tu-dresden.de; goetz.fabian@biotec.tu-dresden.de
Supplementary information: Supplementary data are available at Bioinformatics online.
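The first approach named in the abstract, extracting siblings from the structure of HTML documents, can be illustrated with a small sketch. The record gives no implementation details, so everything below is an assumption made for illustration and not the authors' method or the DOG4DAG code: the names `ListCollector` and `find_siblings`, the `min_seed_hits` threshold, and the heuristic itself (any HTML list that contains at least two known terms is treated as a source of sibling candidates).

```python
from html.parser import HTMLParser


class ListCollector(HTMLParser):
    """Collect the text of <li> items, grouped by their enclosing <ul>/<ol>.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.lists = []    # finished lists, each a list of item strings
        self._stack = []   # lists that are currently open (handles nesting)
        self._item = None  # text fragments of the <li> being read, if any

    def _close_item(self):
        # Store the current item in the innermost open list.
        if self._item is not None and self._stack:
            text = " ".join("".join(self._item).split())
            if text:
                self._stack[-1].append(text)
        self._item = None

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self._close_item()
            self._stack.append([])
        elif tag == "li":
            self._close_item()
            self._item = []

    def handle_endtag(self, tag):
        if tag == "li":
            self._close_item()
        elif tag in ("ul", "ol") and self._stack:
            self._close_item()
            self.lists.append(self._stack.pop())

    def handle_data(self, data):
        if self._item is not None:
            self._item.append(data)


def find_siblings(html, seeds, min_seed_hits=2):
    """Return list items that share an HTML list with enough seed terms.

    Assumed heuristic: a list containing at least `min_seed_hits` distinct
    seed terms is taken to enumerate siblings of those seeds.
    """
    collector = ListCollector()
    collector.feed(html)
    collector.close()

    seed_set = {s.lower() for s in seeds}
    counts = {}
    for items in collector.lists:
        hits = len({item.lower() for item in items} & seed_set)
        if hits >= min_seed_hits:
            for item in items:
                if item.lower() not in seed_set:
                    counts[item] = counts.get(item, 0) + 1
    # Candidates seen in more qualifying lists come first.
    return sorted(counts, key=lambda term: (-counts[term], term))
```

Given a page with one list holding "Asthma", "Bronchitis" and "Pneumonia" and a second list holding navigation links, calling `find_siblings(page, ["Asthma", "Pneumonia"])` would propose only "Bronchitis", because the navigation list contains no seed terms. The example terms are made up and do not come from the paper's evaluation.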
format Online
Article
Text
id pubmed-3371847
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3371847 2012-06-11 Extending ontologies by finding siblings using set expansion techniques
Fabian, Götz; Wächter, Thomas; Schroeder, Michael
Bioinformatics
ISMB 2012 Proceedings Papers Committee, July 15 to July 19, 2012, Long Beach, CA, USA
Oxford University Press 2012-06-15 2012-06-09
/pmc/articles/PMC3371847/ /pubmed/22689774 http://dx.doi.org/10.1093/bioinformatics/bts215
Text en
© The Author(s) 2012. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Extending ontologies by finding siblings using set expansion techniques
topic ISMB 2012 Proceedings Papers Committee, July 15 to July 19, 2012, Long Beach, CA, USA
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3371847/
https://www.ncbi.nlm.nih.gov/pubmed/22689774
http://dx.doi.org/10.1093/bioinformatics/bts215