CLASH: Complementary Linkage with Anchoring and Scoring for Heterogeneous biomolecular and clinical data

Bibliographic Details
Main authors: Nam, Yonghyun; Kim, Myungjun; Lee, Kyungwon; Shin, Hyunjung
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4959382/
https://www.ncbi.nlm.nih.gov/pubmed/27454118
http://dx.doi.org/10.1186/s12911-016-0315-2
author Nam, Yonghyun
Kim, Myungjun
Lee, Kyungwon
Shin, Hyunjung
collection PubMed
description BACKGROUND: Disease-disease association is increasingly viewed and analyzed as a network, in which the connections between diseases are derived from source information on interactome maps of biomolecules such as genes, proteins, and metabolites. Although abundant source information leads to tighter connections between diseases in the network, for certain groups of diseases, such as metabolic diseases, few connections occur because source information is insufficient; a large proportion of their associated genes are still unknown. One way to circumvent the lack of source information is to integrate available external information using an up-to-date integration or fusion method. However, if one wants a disease network that places heavy emphasis on the original data source and uses external sources only to complement it, integration may not be appropriate. Interpretation of the integrated network would be ambiguous: the meaning of an edge would be vague because the information is fused. METHODS: In this study, we propose a network-based algorithm that complements the original network with external information while preserving the network’s originality. The proposed algorithm links a disconnected node to the disease network by using complementary information from an external data source through four steps: anchoring, connecting, scoring, and stopping. RESULTS: When applied to a network of metabolic diseases sourced from protein-protein interaction data, the proposed algorithm recovered connections by 97% and improved AUC performance to 0.71 (up from 0.55) by using external information obtained from text mining of the PubMed comorbidity literature. Experimental results also show that the proposed algorithm is robust to noisy external information.
CONCLUSION: The novelty of this research is that the proposed algorithm preserves the network’s originality while complementing it with external information. Furthermore, it can be used for original association recovery and novel association discovery in disease networks.
format Online
Article
Text
id pubmed-4959382
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4959382 2016-08-02 CLASH: Complementary Linkage with Anchoring and Scoring for Heterogeneous biomolecular and clinical data Nam, Yonghyun Kim, Myungjun Lee, Kyungwon Shin, Hyunjung BMC Med Inform Decis Mak Research BioMed Central 2016-07-25 /pmc/articles/PMC4959382/ /pubmed/27454118 http://dx.doi.org/10.1186/s12911-016-0315-2 Text en © The Author(s). 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title CLASH: Complementary Linkage with Anchoring and Scoring for Heterogeneous biomolecular and clinical data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4959382/
https://www.ncbi.nlm.nih.gov/pubmed/27454118
http://dx.doi.org/10.1186/s12911-016-0315-2