FEDRR: fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical ontologies
BACKGROUND: Redundant hierarchical relations refer to such patterns as two paths from one concept to another, one with length one (direct) and the other with length greater than one (indirect). Each redundant relation represents a possibly unintended defect that needs to be corrected in the ontology...
Main Authors: | Xing, Guangming; Zhang, Guo-Qiang; Cui, Licong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5057496/ https://www.ncbi.nlm.nih.gov/pubmed/27777627 http://dx.doi.org/10.1186/s13040-016-0110-8 |
_version_ | 1782459081872637952 |
---|---|
author | Xing, Guangming Zhang, Guo-Qiang Cui, Licong |
author_facet | Xing, Guangming Zhang, Guo-Qiang Cui, Licong |
author_sort | Xing, Guangming |
collection | PubMed |
description | BACKGROUND: Redundant hierarchical relations refer to such patterns as two paths from one concept to another, one with length one (direct) and the other with length greater than one (indirect). Each redundant relation represents a possibly unintended defect that needs to be corrected in the ontology quality assurance process. Detecting and eliminating redundant relations would help improve the results of all methods relying on the relevant ontological systems as knowledge source, such as the computation of semantic distance between concepts and for ontology matching and alignment. RESULTS: This paper introduces a novel and scalable approach, called FEDRR – Fast, Exhaustive Detection of Redundant Relations – for quality assurance work during ontological evolution. FEDRR combines the algorithm ideas of Dynamic Programming with Topological Sort, for exhaustive mining of all redundant hierarchical relations in ontological hierarchies, in O(c·|V|+|E|) time, where |V| is the number of concepts, |E| is the number of the relations, and c is a constant in practice. Using FEDRR, we performed exhaustive search of all redundant is-a relations in two of the largest ontological systems in biomedicine: SNOMED CT and Gene Ontology (GO). 372 and 1609 redundant is-a relations were found in the 2015-09-01 version of SNOMED CT and 2015-05-01 version of GO, respectively. We have also performed FEDRR on over 190 source vocabularies in the UMLS - a large integrated repository of biomedical ontologies, and identified six sources containing redundant is-a relations. Randomly generated ontologies have also been used to further validate the efficiency of FEDRR. CONCLUSIONS: FEDRR provides a generally applicable, effective tool for systematic detecting redundant relations in large ontological systems for quality improvement. |
format | Online Article Text |
id | pubmed-5057496 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-50574962016-10-24 FEDRR: fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical ontologies Xing, Guangming Zhang, Guo-Qiang Cui, Licong BioData Min Research BACKGROUND: Redundant hierarchical relations refer to such patterns as two paths from one concept to another, one with length one (direct) and the other with length greater than one (indirect). Each redundant relation represents a possibly unintended defect that needs to be corrected in the ontology quality assurance process. Detecting and eliminating redundant relations would help improve the results of all methods relying on the relevant ontological systems as knowledge source, such as the computation of semantic distance between concepts and for ontology matching and alignment. RESULTS: This paper introduces a novel and scalable approach, called FEDRR – Fast, Exhaustive Detection of Redundant Relations – for quality assurance work during ontological evolution. FEDRR combines the algorithm ideas of Dynamic Programming with Topological Sort, for exhaustive mining of all redundant hierarchical relations in ontological hierarchies, in O(c·|V|+|E|) time, where |V| is the number of concepts, |E| is the number of the relations, and c is a constant in practice. Using FEDRR, we performed exhaustive search of all redundant is-a relations in two of the largest ontological systems in biomedicine: SNOMED CT and Gene Ontology (GO). 372 and 1609 redundant is-a relations were found in the 2015-09-01 version of SNOMED CT and 2015-05-01 version of GO, respectively. We have also performed FEDRR on over 190 source vocabularies in the UMLS - a large integrated repository of biomedical ontologies, and identified six sources containing redundant is-a relations. Randomly generated ontologies have also been used to further validate the efficiency of FEDRR. 
CONCLUSIONS: FEDRR provides a generally applicable, effective tool for systematic detecting redundant relations in large ontological systems for quality improvement. BioMed Central 2016-10-10 /pmc/articles/PMC5057496/ /pubmed/27777627 http://dx.doi.org/10.1186/s13040-016-0110-8 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Xing, Guangming Zhang, Guo-Qiang Cui, Licong FEDRR: fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical ontologies |
title | FEDRR: fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical ontologies |
title_full | FEDRR: fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical ontologies |
title_fullStr | FEDRR: fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical ontologies |
title_full_unstemmed | FEDRR: fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical ontologies |
title_short | FEDRR: fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical ontologies |
title_sort | fedrr: fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical ontologies |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5057496/ https://www.ncbi.nlm.nih.gov/pubmed/27777627 http://dx.doi.org/10.1186/s13040-016-0110-8 |
work_keys_str_mv | AT xingguangming fedrrfastexhaustivedetectionofredundanthierarchicalrelationsforqualityimprovementoflargebiomedicalontologies AT zhangguoqiang fedrrfastexhaustivedetectionofredundanthierarchicalrelationsforqualityimprovementoflargebiomedicalontologies AT cuilicong fedrrfastexhaustivedetectionofredundanthierarchicalrelationsforqualityimprovementoflargebiomedicalontologies |
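
The description field above defines a redundant relation as a direct is-a link that coexists with an indirect path (length greater than one) between the same two concepts, and says FEDRR finds such links by combining Dynamic Programming with Topological Sort. The following is a minimal illustrative sketch of that general idea, not the authors' implementation: the function name `find_redundant_relations`, the `(child, parent)` edge format, and the use of per-concept ancestor sets are all assumptions made here, and this simple version does not attempt to reach the O(c·|V|+|E|) running time reported in the abstract.

```python
from collections import defaultdict, deque


def find_redundant_relations(edges):
    """Return the direct is-a edges that are also implied by a longer path.

    `edges` is an iterable of (child, parent) pairs (a hypothetical input
    format) and is assumed to form a directed acyclic graph.
    """
    parents = defaultdict(set)
    children = defaultdict(set)
    nodes = set()
    for child, parent in edges:
        parents[child].add(parent)
        children[parent].add(child)
        nodes.update((child, parent))

    # Topological order (Kahn's algorithm): a concept is processed only
    # after every one of its direct parents has been processed.
    pending = {n: len(parents[n]) for n in nodes}
    queue = deque(n for n in nodes if pending[n] == 0)

    ancestors = {}   # concept -> set of all its proper ancestors
    redundant = set()
    processed = 0

    while queue:
        node = queue.popleft()
        processed += 1

        # Dynamic-programming step: reuse the ancestor sets already computed
        # for the parents. Their union is exactly the set of concepts
        # reachable from `node` by a path of length two or more.
        indirect = set()
        for p in parents[node]:
            indirect |= ancestors[p]

        # A direct parent that is also reachable indirectly is redundant.
        for p in parents[node]:
            if p in indirect:
                redundant.add((node, p))

        ancestors[node] = indirect | parents[node]

        for c in children[node]:
            pending[c] -= 1
            if pending[c] == 0:
                queue.append(c)

    if processed != len(nodes):
        raise ValueError("the is-a hierarchy contains a cycle")
    return redundant


# Toy hierarchy: A is-a B, B is-a C, plus a direct A is-a C link.
print(sorted(find_redundant_relations([("A", "B"), ("B", "C"), ("A", "C")])))
# [('A', 'C')]
```

Storing a full ancestor set per concept keeps the sketch short but can use a lot of memory on hierarchies the size of SNOMED CT; it is meant only to make the definition of a redundant relation concrete.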