CLASH: Complementary Linkage with Anchoring and Scoring for Heterogeneous biomolecular and clinical data
BACKGROUND: The study on disease-disease association has been increasingly viewed and analyzed as a network, in which the connections between diseases are configured using the source information on interactome maps of biomolecules such as genes, proteins, metabolites, etc. Although abundance in sour...
Main authors: | Nam, Yonghyun; Kim, Myungjun; Lee, Kyungwon; Shin, Hyunjung |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4959382/ https://www.ncbi.nlm.nih.gov/pubmed/27454118 http://dx.doi.org/10.1186/s12911-016-0315-2 |
_version_ | 1782444395479433216 |
---|---|
author | Nam, Yonghyun Kim, Myungjun Lee, Kyungwon Shin, Hyunjung |
author_facet | Nam, Yonghyun Kim, Myungjun Lee, Kyungwon Shin, Hyunjung |
author_sort | Nam, Yonghyun |
collection | PubMed |
description | BACKGROUND: The study of disease-disease association has been increasingly viewed and analyzed as a network, in which the connections between diseases are configured using the source information on interactome maps of biomolecules such as genes, proteins, metabolites, etc. Although an abundance of source information leads to tighter connections between diseases in the network, for a certain group of diseases, such as metabolic diseases, few connections occur because the source information is insufficient; a large proportion of their associated genes are still unknown. One way to circumvent the lack of source information is to integrate available external information by using one of the up-to-date integration or fusion methods. However, if one wants a disease network that places heavy emphasis on the original source of data but still uses external sources only to complement it, integration may not be pertinent. Interpretation of the integrated network would be ambiguous: the meanings conferred on edges would be vague due to fused information. METHODS: In this study, we propose a network-based algorithm that complements the original network by utilizing external information while preserving the network’s originality. The proposed algorithm links a disconnected node to the disease network by using complementary information from an external data source through four steps: anchoring, connecting, scoring, and stopping. RESULTS: When applied to the network of metabolic diseases that is sourced from protein-protein interaction data, the proposed algorithm recovered connections by 97%, and improved the AUC performance up to 0.71 (lifted from 0.55) by using external information sourced from text mining results on the PubMed comorbidity literature. Experimental results also show that the proposed algorithm is robust to noisy external information. CONCLUSION: The novelty of this research is that the proposed algorithm preserves the network’s originality while complementing it with external information. Furthermore, it can be utilized for original association recovery and novel association discovery in disease networks. |
format | Online Article Text |
id | pubmed-4959382 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4959382 2016-08-02 CLASH: Complementary Linkage with Anchoring and Scoring for Heterogeneous biomolecular and clinical data Nam, Yonghyun Kim, Myungjun Lee, Kyungwon Shin, Hyunjung BMC Med Inform Decis Mak Research BioMed Central 2016-07-25 /pmc/articles/PMC4959382/ /pubmed/27454118 http://dx.doi.org/10.1186/s12911-016-0315-2 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Nam, Yonghyun Kim, Myungjun Lee, Kyungwon Shin, Hyunjung CLASH: Complementary Linkage with Anchoring and Scoring for Heterogeneous biomolecular and clinical data |
title | CLASH: Complementary Linkage with Anchoring and Scoring for Heterogeneous biomolecular and clinical data |
title_sort | clash: complementary linkage with anchoring and scoring for heterogeneous biomolecular and clinical data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4959382/ https://www.ncbi.nlm.nih.gov/pubmed/27454118 http://dx.doi.org/10.1186/s12911-016-0315-2 |