
FEDRR: fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical ontologies

BACKGROUND: Redundant hierarchical relations refer to such patterns as two paths from one concept to another, one with length one (direct) and the other with length greater than one (indirect). Each redundant relation represents a possibly unintended defect that needs to be corrected in the ontology...


Bibliographic Details
Main authors: Xing, Guangming, Zhang, Guo-Qiang, Cui, Licong
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5057496/
https://www.ncbi.nlm.nih.gov/pubmed/27777627
http://dx.doi.org/10.1186/s13040-016-0110-8
author Xing, Guangming
Zhang, Guo-Qiang
Cui, Licong
collection PubMed
description BACKGROUND: Redundant hierarchical relations refer to such patterns as two paths from one concept to another, one with length one (direct) and the other with length greater than one (indirect). Each redundant relation represents a possibly unintended defect that needs to be corrected in the ontology quality assurance process. Detecting and eliminating redundant relations would help improve the results of all methods relying on the relevant ontological systems as a knowledge source, such as the computation of semantic distance between concepts and ontology matching and alignment. RESULTS: This paper introduces a novel and scalable approach, called FEDRR – Fast, Exhaustive Detection of Redundant Relations – for quality assurance work during ontological evolution. FEDRR combines the algorithmic ideas of Dynamic Programming with Topological Sort, for exhaustive mining of all redundant hierarchical relations in ontological hierarchies, in O(c·|V|+|E|) time, where |V| is the number of concepts, |E| is the number of relations, and c is a constant in practice. Using FEDRR, we performed an exhaustive search of all redundant is-a relations in two of the largest ontological systems in biomedicine: SNOMED CT and Gene Ontology (GO). We found 372 and 1609 redundant is-a relations in the 2015-09-01 version of SNOMED CT and the 2015-05-01 version of GO, respectively. We have also applied FEDRR to over 190 source vocabularies in the UMLS, a large integrated repository of biomedical ontologies, and identified six sources containing redundant is-a relations. Randomly generated ontologies have also been used to further validate the efficiency of FEDRR. CONCLUSIONS: FEDRR provides a generally applicable, effective tool for systematically detecting redundant relations in large ontological systems for quality improvement.
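The definition in the abstract (a direct is-a link that is also implied by a longer path) can be illustrated with a small Python sketch. This is not the authors' FEDRR implementation: it only combines a topological ordering with reuse of already computed ancestor sets, in the spirit of the two ideas the abstract names, and it makes no claim to reach the stated O(c·|V|+|E|) bound. The function name `find_redundant_is_a` and the `(child, parent)` edge format are assumptions chosen for illustration.

```python
from collections import defaultdict, deque


def find_redundant_is_a(edges):
    """Return the (child, parent) edges that are also implied by a longer path.

    `edges` is an iterable of (child, parent) pairs; this input format is an
    assumption made for the sketch, not something the record specifies.
    """
    parents = defaultdict(set)
    children = defaultdict(set)
    nodes = set()
    for child, parent in edges:
        parents[child].add(parent)
        children[parent].add(child)
        nodes.update((child, parent))

    # Kahn's algorithm: a concept is processed only after all of its parents.
    pending = {n: len(parents[n]) for n in nodes}
    queue = deque(n for n in nodes if pending[n] == 0)

    ancestors = {}    # concept -> set of all its ancestors
    redundant = set()
    processed = 0
    while queue:
        node = queue.popleft()
        processed += 1

        # Reuse the ancestor sets already computed for the direct parents.
        indirect = set()
        for p in parents[node]:
            indirect |= ancestors[p]

        # A direct parent that is also reachable through another parent
        # means the direct edge duplicates a path of length two or more.
        for p in parents[node]:
            if p in indirect:
                redundant.add((node, p))

        ancestors[node] = indirect | parents[node]
        for c in children[node]:
            pending[c] -= 1
            if pending[c] == 0:
                queue.append(c)

    if processed != len(nodes):
        raise ValueError("hierarchy contains a cycle")
    return redundant


# Toy hierarchy: C is-a B, B is-a A, plus a direct C is-a A link.
print(find_redundant_is_a([("B", "A"), ("C", "B"), ("C", "A")]))
```

In the toy hierarchy, the direct link from C to A is reported because A is already reachable through B. Storing full ancestor sets keeps the sketch short but can use a lot of memory on large hierarchies, which is one reason it should be read as an illustration of the definition rather than as a scalable method.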
format Online
Article
Text
id pubmed-5057496
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5057496 2016-10-24 FEDRR: fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical ontologies Xing, Guangming Zhang, Guo-Qiang Cui, Licong BioData Min Research BioMed Central 2016-10-10 /pmc/articles/PMC5057496/ /pubmed/27777627 http://dx.doi.org/10.1186/s13040-016-0110-8 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title FEDRR: fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical ontologies
topic Research