Prediction of comorbid diseases using weighted geometric embedding of human interactome

BACKGROUND: Comorbidity is the phenomenon of two or more diseases occurring simultaneously not by random chance and presents great challenges to accurate diagnosis and treatment. As an effort toward better understanding the genetic causes of comorbidity, in this work, we have developed a computation...

Bibliographic Details
Main Authors: Akram, Pakeeza, Liao, Li
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6936100/
https://www.ncbi.nlm.nih.gov/pubmed/31888634
http://dx.doi.org/10.1186/s12920-019-0605-5
_version_ 1783483684413767680
author Akram, Pakeeza
Liao, Li
author_facet Akram, Pakeeza
Liao, Li
author_sort Akram, Pakeeza
collection PubMed
description BACKGROUND: Comorbidity is the phenomenon of two or more diseases occurring simultaneously, not by random chance, and it presents great challenges to accurate diagnosis and treatment. As an effort toward better understanding the genetic causes of comorbidity, in this work we have developed a computational method to predict comorbid diseases. Two diseases that share common genes tend to be more comorbid. Previous work shows that, after mapping the associated genes onto the human interactome, the distance between the two disease modules (subgraphs) is correlated with comorbidity. METHODS: To fully incorporate structural characteristics of the interactome as features into the prediction of comorbidity, our method embeds the human interactome into a high-dimensional geometric space with weights assigned to the network edges and uses the projections onto different dimensions to “fingerprint” disease modules. A supervised machine learning classifier is then trained to discriminate comorbid diseases from non-comorbid diseases. RESULTS: In cross-validation using a benchmark dataset of more than 10,000 disease pairs, we report that our model achieves a ROC score of 0.90 for a comorbidity threshold at relative risk RR = 0 and 0.76 for a comorbidity threshold at RR = 1, and significantly outperforms the previous method and the interactome generated by annotated data. To further incorporate prior knowledge of pathway associations with diseases, we weight the protein-protein interaction network edges according to their frequency of occurrence in those pathways, in such a way that edges with higher frequency are more likely to be selected in the minimum spanning tree for geometric embedding. Such weighted embedding is shown to lead to further improvement of comorbid disease prediction.
CONCLUSION: The work demonstrates that embedding the two-dimensional planar graph of the human interactome into a high-dimensional geometric space allows for characterizing and capturing disease modules (subgraphs formed by the disease-associated genes) from multiple perspectives, and hence provides enriched features for a supervised classifier to discriminate comorbid disease pairs from non-comorbid disease pairs more accurately than module separation alone.
format Online
Article
Text
id pubmed-6936100
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6936100 2019-12-31 Prediction of comorbid diseases using weighted geometric embedding of human interactome Akram, Pakeeza Liao, Li BMC Med Genomics Research BioMed Central 2019-12-30 /pmc/articles/PMC6936100/ /pubmed/31888634 http://dx.doi.org/10.1186/s12920-019-0605-5 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Akram, Pakeeza
Liao, Li
Prediction of comorbid diseases using weighted geometric embedding of human interactome
title Prediction of comorbid diseases using weighted geometric embedding of human interactome
title_full Prediction of comorbid diseases using weighted geometric embedding of human interactome
title_fullStr Prediction of comorbid diseases using weighted geometric embedding of human interactome
title_full_unstemmed Prediction of comorbid diseases using weighted geometric embedding of human interactome
title_short Prediction of comorbid diseases using weighted geometric embedding of human interactome
title_sort prediction of comorbid diseases using weighted geometric embedding of human interactome
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6936100/
https://www.ncbi.nlm.nih.gov/pubmed/31888634
http://dx.doi.org/10.1186/s12920-019-0605-5
work_keys_str_mv AT akrampakeeza predictionofcomorbiddiseasesusingweightedgeometricembeddingofhumaninteractome
AT liaoli predictionofcomorbiddiseasesusingweightedgeometricembeddingofhumaninteractome
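
The abstract in this record says that protein-protein interaction edges are weighted by how often they occur in disease-associated pathways, so that frequent edges are more likely to be selected in the minimum spanning tree used for the embedding. The record gives neither the weighting formula nor the precise definition of an edge "occurring" in a pathway, so the following is only a minimal sketch under stated assumptions: an edge is counted for a pathway when both of its proteins belong to it, and the weight is taken as 1 / (1 + frequency). The function names, the toy proteins A to D, and the pathways p1 and p2 are all hypothetical.

```python
from collections import Counter


def pathway_edge_weights(edges, pathways):
    """Give each edge a weight that shrinks as its pathway frequency grows.

    Assumption (not from the record): an edge occurs in a pathway when
    both endpoints are members of that pathway.
    """
    freq = Counter()
    for u, v in edges:
        for members in pathways.values():
            if u in members and v in members:
                freq[(u, v)] += 1
    # Assumed transform: weight = 1 / (1 + frequency), so frequent edges
    # are cheaper and are considered earlier by a minimum spanning tree.
    return {(u, v): 1.0 / (1 + freq[(u, v)]) for u, v in edges}


def minimum_spanning_tree(weights):
    """Kruskal's algorithm with a union-find; returns the selected edges."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    # Sort by weight, breaking ties by edge name for a deterministic result.
    for (u, v), _ in sorted(weights.items(), key=lambda kv: (kv[1], kv[0])):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two separate components
            parent[root_u] = root_v
            tree.append((u, v))
    return tree


# Hypothetical toy interactome and pathway memberships.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
pathways = {"p1": {"A", "C"}, "p2": {"A", "C", "D"}}

weights = pathway_edge_weights(edges, pathways)
tree = minimum_spanning_tree(weights)
print(tree)
```

In this toy case the edge A-C appears in both pathways and C-D in one, so both enter the tree before the pathway-free edges A-B and B-C, and B-C is left out because it would close a cycle. Any monotonically decreasing transform of the frequency would give the same preference ordering; the reciprocal is used here only for simplicity.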