Semi-automated ontology generation within OBO-Edit
Motivation: Ontologies and taxonomies have proven highly beneficial for biocuration. The Open Biomedical Ontology (OBO) Foundry alone lists over 90 ontologies mainly built with OBO-Edit. Creating and maintaining such ontologies is a labour-intensive, difficult, manual process. Automating parts of it...
Main authors: | Wächter, Thomas; Schroeder, Michael |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2010 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881373/ https://www.ncbi.nlm.nih.gov/pubmed/20529942 http://dx.doi.org/10.1093/bioinformatics/btq188 |
_version_ | 1782182109163552768 |
---|---|
author | Wächter, Thomas; Schroeder, Michael |
author_facet | Wächter, Thomas; Schroeder, Michael |
author_sort | Wächter, Thomas |
collection | PubMed |
description | Motivation: Ontologies and taxonomies have proven highly beneficial for biocuration. The Open Biomedical Ontology (OBO) Foundry alone lists over 90 ontologies mainly built with OBO-Edit. Creating and maintaining such ontologies is a labour-intensive, difficult, manual process. Automating parts of it is of great importance for the further development of ontologies and for biocuration. Results: We have developed the Dresden Ontology Generator for Directed Acyclic Graphs (DOG4DAG), a system which supports the creation and extension of OBO ontologies by semi-automatically generating terms, definitions and parent–child relations from text in PubMed, the web and PDF repositories. DOG4DAG is seamlessly integrated into OBO-Edit. It generates terms by identifying statistically significant noun phrases in text. For definitions and parent–child relations it employs pattern-based web searches. We systematically evaluate each generation step using manually validated benchmarks. The term generation leads to high-quality terms also found in manually created ontologies. Up to 78% of definitions are valid and up to 54% of child–ancestor relations can be retrieved. There is no other validated system that achieves comparable results. By combining the prediction of high-quality terms, definitions and parent–child relations with the ontology editor OBO-Edit we contribute a thoroughly validated tool for all OBO ontology engineers. Availability: DOG4DAG is available within OBO-Edit 2.1 at http://www.oboedit.org Contact: thomas.waechter@biotec.tu-dresden.de; Supplementary Information: Supplementary data are available at Bioinformatics online. |
format | Text |
id | pubmed-2881373 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-2881373 2010-06-08 Semi-automated ontology generation within OBO-Edit Wächter, Thomas Schroeder, Michael Bioinformatics ISMB 2010 Conference Proceedings July 11 to July 13, 2010, Boston, MA, USA Oxford University Press 2010-06-15 2010-06-01 /pmc/articles/PMC2881373/ /pubmed/20529942 http://dx.doi.org/10.1093/bioinformatics/btq188 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | ISMB 2010 Conference Proceedings July 11 to July 13, 2010, Boston, MA, USA Wächter, Thomas Schroeder, Michael Semi-automated ontology generation within OBO-Edit |
title | Semi-automated ontology generation within OBO-Edit |
title_full | Semi-automated ontology generation within OBO-Edit |
title_fullStr | Semi-automated ontology generation within OBO-Edit |
title_full_unstemmed | Semi-automated ontology generation within OBO-Edit |
title_short | Semi-automated ontology generation within OBO-Edit |
title_sort | semi-automated ontology generation within obo-edit |
topic | ISMB 2010 Conference Proceedings July 11 to July 13, 2010, Boston, MA, USA |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881373/ https://www.ncbi.nlm.nih.gov/pubmed/20529942 http://dx.doi.org/10.1093/bioinformatics/btq188 |
work_keys_str_mv | AT wachterthomas semiautomatedontologygenerationwithinoboedit AT schroedermichael semiautomatedontologygenerationwithinoboedit |