
Inference of domain-disease associations from domain-protein, protein-disease and disease-disease relationships

BACKGROUND: Protein domains can be viewed as portable units of biological function that define the functional properties of proteins. Therefore, if a protein is associated with a disease, its protein domains might also be associated with that disease and define disease endophenotypes. However, knowledge about such domai...


Bibliographic Details
Main authors: Zhang, Wangshu, Coba, Marcelo P., Sun, Fengzhu
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895779/
https://www.ncbi.nlm.nih.gov/pubmed/26818594
http://dx.doi.org/10.1186/s12918-015-0247-y
_version_ 1782435922152783872
author Zhang, Wangshu
Coba, Marcelo P.
Sun, Fengzhu
author_facet Zhang, Wangshu
Coba, Marcelo P.
Sun, Fengzhu
author_sort Zhang, Wangshu
collection PubMed
description BACKGROUND: Protein domains can be viewed as portable units of biological function that define the functional properties of proteins. Therefore, if a protein is associated with a disease, its protein domains might also be associated with that disease and define disease endophenotypes. However, knowledge about such domain-disease relationships is rarely available. Thus, identification of domains associated with human diseases would greatly improve our understanding of the mechanisms of human complex diseases and further improve the prevention, diagnosis and treatment of these diseases. METHODS: Based on phenotypic similarities among diseases, we first group diseases into overlapping modules. We then develop a framework to infer associations between domains and diseases through known relationships between diseases and modules, domains and proteins, as well as proteins and disease modules. Different methods including Association, Maximum likelihood estimation (MLE), Domain-disease pair exclusion analysis (DPEA), Bayesian, and Parsimonious explanation (PE) approaches are developed to predict domain-disease associations. RESULTS: We demonstrate the effectiveness of all five approaches via a series of validation experiments, and show the robustness of the MLE, Bayesian and PE approaches to the involved parameters. We also study the effects of disease modularization in inferring novel domain-disease associations. Through validation, the AUC (Area Under the receiver operating characteristic Curve) scores for the Bayesian, MLE, DPEA, PE, and Association approaches are 0.86, 0.84, 0.83, 0.83 and 0.79, respectively, indicating the usefulness of these approaches for predicting domain-disease relationships. Finally, we choose the Bayesian approach to infer domains associated with two common diseases, Crohn’s disease and type 2 diabetes. CONCLUSIONS: The Bayesian approach has the best performance for the inference of domain-disease relationships. The predicted landscape between domains and diseases provides a more detailed view of the disease mechanisms. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12918-015-0247-y) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4895779
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4895779 2016-06-10 Inference of domain-disease associations from domain-protein, protein-disease and disease-disease relationships Zhang, Wangshu Coba, Marcelo P. Sun, Fengzhu BMC Syst Biol Proceedings BACKGROUND: Protein domains can be viewed as portable units of biological function that define the functional properties of proteins. Therefore, if a protein is associated with a disease, its protein domains might also be associated with that disease and define disease endophenotypes. However, knowledge about such domain-disease relationships is rarely available. Thus, identification of domains associated with human diseases would greatly improve our understanding of the mechanisms of human complex diseases and further improve the prevention, diagnosis and treatment of these diseases. METHODS: Based on phenotypic similarities among diseases, we first group diseases into overlapping modules. We then develop a framework to infer associations between domains and diseases through known relationships between diseases and modules, domains and proteins, as well as proteins and disease modules. Different methods including Association, Maximum likelihood estimation (MLE), Domain-disease pair exclusion analysis (DPEA), Bayesian, and Parsimonious explanation (PE) approaches are developed to predict domain-disease associations. RESULTS: We demonstrate the effectiveness of all five approaches via a series of validation experiments, and show the robustness of the MLE, Bayesian and PE approaches to the involved parameters. We also study the effects of disease modularization in inferring novel domain-disease associations. Through validation, the AUC (Area Under the receiver operating characteristic Curve) scores for the Bayesian, MLE, DPEA, PE, and Association approaches are 0.86, 0.84, 0.83, 0.83 and 0.79, respectively, indicating the usefulness of these approaches for predicting domain-disease relationships. Finally, we choose the Bayesian approach to infer domains associated with two common diseases, Crohn’s disease and type 2 diabetes. CONCLUSIONS: The Bayesian approach has the best performance for the inference of domain-disease relationships. The predicted landscape between domains and diseases provides a more detailed view of the disease mechanisms. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12918-015-0247-y) contains supplementary material, which is available to authorized users. BioMed Central 2016-01-11 /pmc/articles/PMC4895779/ /pubmed/26818594 http://dx.doi.org/10.1186/s12918-015-0247-y Text en © Zhang et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Proceedings
Zhang, Wangshu
Coba, Marcelo P.
Sun, Fengzhu
Inference of domain-disease associations from domain-protein, protein-disease and disease-disease relationships
title Inference of domain-disease associations from domain-protein, protein-disease and disease-disease relationships
title_full Inference of domain-disease associations from domain-protein, protein-disease and disease-disease relationships
title_fullStr Inference of domain-disease associations from domain-protein, protein-disease and disease-disease relationships
title_full_unstemmed Inference of domain-disease associations from domain-protein, protein-disease and disease-disease relationships
title_short Inference of domain-disease associations from domain-protein, protein-disease and disease-disease relationships
title_sort inference of domain-disease associations from domain-protein, protein-disease and disease-disease relationships
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895779/
https://www.ncbi.nlm.nih.gov/pubmed/26818594
http://dx.doi.org/10.1186/s12918-015-0247-y
work_keys_str_mv AT zhangwangshu inferenceofdomaindiseaseassociationsfromdomainproteinproteindiseaseanddiseasediseaserelationships
AT cobamarcelop inferenceofdomaindiseaseassociationsfromdomainproteinproteindiseaseanddiseasediseaserelationships
AT sunfengzhu inferenceofdomaindiseaseassociationsfromdomainproteinproteindiseaseanddiseasediseaserelationships