
Semi-automated ontology generation within OBO-Edit

Motivation: Ontologies and taxonomies have proven highly beneficial for biocuration. The Open Biomedical Ontology (OBO) Foundry alone lists over 90 ontologies mainly built with OBO-Edit. Creating and maintaining such ontologies is a labour-intensive, difficult, manual process. Automating parts of it...


Bibliographic Details
Main authors: Wächter, Thomas; Schroeder, Michael
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881373/
https://www.ncbi.nlm.nih.gov/pubmed/20529942
http://dx.doi.org/10.1093/bioinformatics/btq188
author Wächter, Thomas
Schroeder, Michael
collection PubMed
description Motivation: Ontologies and taxonomies have proven highly beneficial for biocuration. The Open Biomedical Ontology (OBO) Foundry alone lists over 90 ontologies mainly built with OBO-Edit. Creating and maintaining such ontologies is a labour-intensive, difficult, manual process. Automating parts of it is of great importance for the further development of ontologies and for biocuration. Results: We have developed the Dresden Ontology Generator for Directed Acyclic Graphs (DOG4DAG), a system which supports the creation and extension of OBO ontologies by semi-automatically generating terms, definitions and parent–child relations from text in PubMed, the web and PDF repositories. DOG4DAG is seamlessly integrated into OBO-Edit. It generates terms by identifying statistically significant noun phrases in text. For definitions and parent–child relations it employs pattern-based web searches. We systematically evaluate each generation step using manually validated benchmarks. The term generation leads to high-quality terms also found in manually created ontologies. Up to 78% of definitions are valid and up to 54% of child–ancestor relations can be retrieved. There is no other validated system that achieves comparable results. By combining the prediction of high-quality terms, definitions and parent–child relations with the ontology editor OBO-Edit we contribute a thoroughly validated tool for all OBO ontology engineers. Availability: DOG4DAG is available within OBO-Edit 2.1 at http://www.oboedit.org Contact: thomas.waechter@biotec.tu-dresden.de; Supplementary Information: Supplementary data are available at Bioinformatics online.
format Text
id pubmed-2881373
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2881373 2010-06-08 Semi-automated ontology generation within OBO-Edit Wächter, Thomas Schroeder, Michael Bioinformatics ISMB 2010 Conference Proceedings, July 11 to July 13, 2010, Boston, MA, USA
Oxford University Press 2010-06-15 2010-06-01 /pmc/articles/PMC2881373/ /pubmed/20529942 http://dx.doi.org/10.1093/bioinformatics/btq188 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Semi-automated ontology generation within OBO-Edit
topic ISMB 2010 Conference Proceedings, July 11 to July 13, 2010, Boston, MA, USA
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881373/
https://www.ncbi.nlm.nih.gov/pubmed/20529942
http://dx.doi.org/10.1093/bioinformatics/btq188