
InfAcrOnt: calculating cross-ontology term similarities using information flow by a random walk

BACKGROUND: Since the establishment of the first biomedical ontology, the Gene Ontology (GO), the number of biomedical ontologies has increased dramatically. More than 300 ontologies have now been built, including the extensively used Disease Ontology (DO) and Human Phenotype Ontology (HPO). Because of the adva...


Bibliographic Details
Main Authors: Cheng, Liang; Jiang, Yue; Ju, Hong; Sun, Jie; Peng, Jiajie; Zhou, Meng; Hu, Yang
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5780854/
https://www.ncbi.nlm.nih.gov/pubmed/29363423
http://dx.doi.org/10.1186/s12864-017-4338-6
author Cheng, Liang
Jiang, Yue
Ju, Hong
Sun, Jie
Peng, Jiajie
Zhou, Meng
Hu, Yang
collection PubMed
description BACKGROUND: Since the establishment of the first biomedical ontology, the Gene Ontology (GO), the number of biomedical ontologies has increased dramatically. More than 300 ontologies have now been built, including the extensively used Disease Ontology (DO) and Human Phenotype Ontology (HPO). Because it can reveal novel relationships between terms, calculating similarity between ontology terms is one of the major tasks in this research area. Although similarities between terms within each ontology have been studied with in silico methods, term similarities across different ontologies have not been investigated as deeply. The latest method took advantage of a gene functional interaction network (GFIN) to explore such inter-ontology term similarities. However, it used only gene interactions and failed to make full use of the connectivity among gene nodes in the network. In addition, all existing methods are designed specifically for GO, and their performance on the extended ontology community remains unknown. RESULTS: We propose a method, InfAcrOnt, to infer similarities between terms across ontologies using the entire GFIN. InfAcrOnt builds a term-gene-gene network comprising ontology annotations and the GFIN, and derives similarities between terms across ontologies by modeling the information flow within the network with a random walk. In our benchmark experiments on sub-ontologies of GO, InfAcrOnt achieves a high average area under the receiver operating characteristic curve (AUC) (0.9322 and 0.9309) and low standard deviations (1.8746e-6 and 3.0977e-6) on both the human and yeast benchmark datasets, exhibiting superior performance. Meanwhile, comparisons of InfAcrOnt results with prior knowledge on pairwise DO-HPO terms and pairwise DO-GO terms show high correlations. CONCLUSIONS: The experimental results show that InfAcrOnt significantly improves the performance of inferring similarities between terms across ontologies on the benchmark set.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-017-4338-6) contains supplementary material, which is available to authorized users.
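The idea in the RESULTS paragraph (a term-gene-gene network scored by a random walk) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which random-walk variant is used, so the restart scheme, the restart probability of 0.5, the column normalization, and the toy node names (termA1, termB1, termB2, g1 to g4) are all assumptions made for illustration only.

```python
import numpy as np


def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk restarting at `seed`.

    Assumption: a random walk with restart on a column-normalized adjacency
    matrix; the catalog abstract does not specify the exact walk model.
    """
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # guard against isolated nodes
    transition = adj / col_sums        # each column sums to 1
    p0 = np.zeros(adj.shape[0])
    p0[seed] = 1.0                     # all probability starts on the seed term
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * transition @ p + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Hypothetical toy network: one term from ontology A, two terms from
# ontology B, and four genes. Term-gene edges stand in for annotations,
# gene-gene edges stand in for the functional interaction network.
nodes = ["termA1", "termB1", "termB2", "g1", "g2", "g3", "g4"]
idx = {name: i for i, name in enumerate(nodes)}
edges = [
    ("termA1", "g1"), ("termA1", "g2"),   # annotations of termA1
    ("termB1", "g2"), ("termB1", "g3"),   # annotations of termB1
    ("termB2", "g4"),                     # annotation of termB2
    ("g1", "g3"), ("g3", "g4"),           # gene-gene interactions
]

adj = np.zeros((len(nodes), len(nodes)))
for u, v in edges:
    adj[idx[u], idx[v]] = 1.0
    adj[idx[v], idx[u]] = 1.0             # undirected network

scores = random_walk_with_restart(adj, idx["termA1"])
for term in ("termB1", "termB2"):
    print(term, scores[idx[term]])
```

In this toy network termB1 shares a gene with termA1 and termB2 is reachable only through a chain of gene interactions, so the walk assigns termB1 the higher score. Reading the steady-state probability at a term of the other ontology as a cross-ontology similarity is one plausible interpretation of "information flow", not a claim about the published scoring formula.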
format Online
Article
Text
id pubmed-5780854
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5780854 2018-02-06 BMC Genomics Research BioMed Central 2018-01-19 /pmc/articles/PMC5780854/ /pubmed/29363423 http://dx.doi.org/10.1186/s12864-017-4338-6 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title InfAcrOnt: calculating cross-ontology term similarities using information flow by a random walk
topic Research