DomainRBF: a Bayesian regression approach to the prioritization of candidate domains for complex diseases

BACKGROUND: Domains are basic units of proteins, and thus exploring associations between protein domains and human inherited diseases will greatly improve our understanding of the pathogenesis of human complex diseases and further benefit the medical prevention, diagnosis and treatment of these dise...

Bibliographic Details
Main authors: Zhang, Wangshu; Chen, Yong; Sun, Fengzhu; Jiang, Rui
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3108930/
https://www.ncbi.nlm.nih.gov/pubmed/21504591
http://dx.doi.org/10.1186/1752-0509-5-55
author Zhang, Wangshu
Chen, Yong
Sun, Fengzhu
Jiang, Rui
collection PubMed
description BACKGROUND: Domains are basic units of proteins, so exploring associations between protein domains and human inherited diseases will improve our understanding of the pathogenesis of human complex diseases and benefit their prevention, diagnosis and treatment. Within a given domain-domain interaction network, we assume that similarities of disease phenotypes can be explained by the proximities of the domains associated with those diseases. Based on this assumption, we propose a Bayesian regression approach named "domainRBF" (domain Rank with Bayes Factor) to prioritize candidate domains for human complex diseases.
RESULTS: Using a compiled dataset containing 1,614 associations between 671 domains and 1,145 disease phenotypes, we demonstrate the effectiveness of the proposed approach through three large-scale leave-one-out cross-validation experiments (random control, simulated linkage interval, and genome-wide scan), evaluated by three criteria (precision, mean rank ratio, and AUC score). Through a series of permutation tests, we further show that the approach is robust to the parameters involved and to the underlying domain-domain interaction network. Having assessed the validity of the approach, we show that ab initio inference of domain-disease and gene-disease associations is possible, and we illustrate the strong agreement between our inferences and the evidence from genome-wide association studies for four common diseases (type 1 diabetes, type 2 diabetes, Crohn's disease, and breast cancer). Finally, we provide a pre-calculated genome-wide landscape of associations between 5,490 protein domains and 5,080 human diseases and offer free access to this resource.
CONCLUSIONS: The proposed approach effectively ranks susceptible domains near the top of the candidate list, and it is robust to the parameters involved. The ab initio inference of domain-disease associations shows strong agreement with the evidence provided by genome-wide association studies. The predicted landscape provides a comprehensive view of associations between domains and human diseases.
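The abstract names three evaluation criteria (precision, mean rank ratio, and AUC score) without defining them. The following is a minimal sketch of how such rank-based criteria are commonly computed from leave-one-out runs; it is not the authors' code. The function names, the pessimistic tie handling, and the per-run averaging used for the AUC are all assumptions made for illustration, and the record itself does not confirm them.

```python
from typing import Sequence


def rank_of_held_out(scores: Sequence[float], held_out: int) -> int:
    """1-based rank of the held-out candidate, sorting by descending score.

    Assumption: ties are resolved pessimistically, so every candidate
    with an equal score is counted as ranked ahead of the held-out one.
    """
    target = scores[held_out]
    ahead = sum(1 for i, s in enumerate(scores) if i != held_out and s >= target)
    return 1 + ahead


def precision_at_top(ranks: Sequence[int]) -> float:
    """Fraction of validation runs in which the held-out candidate ranks first."""
    return sum(1 for r in ranks if r == 1) / len(ranks)


def mean_rank_ratio(ranks: Sequence[int], sizes: Sequence[int]) -> float:
    """Average of rank divided by candidate-list size (lower is better)."""
    return sum(r / n for r, n in zip(ranks, sizes)) / len(ranks)


def auc_from_ranks(ranks: Sequence[int], sizes: Sequence[int]) -> float:
    """Average fraction of control candidates ranked below the held-out one.

    Assumption: one positive per run and at least two candidates per run.
    """
    return sum((n - r) / (n - 1) for r, n in zip(ranks, sizes)) / len(ranks)


# Hypothetical scores for two validation runs; index 0 is the held-out domain.
runs = [[0.9, 0.2, 0.1], [0.3, 0.8, 0.5, 0.1]]
ranks = [rank_of_held_out(scores, 0) for scores in runs]
sizes = [len(scores) for scores in runs]
```

With these definitions, a method that always ranks the held-out domain first would reach a precision of 1, a mean rank ratio of 1/n per run, and an AUC of 1.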
format Online
Article
Text
id pubmed-3108930
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3108930 2011-06-07 BMC Syst Biol Research Article BioMed Central 2011-04-19 /pmc/articles/PMC3108930/ /pubmed/21504591 http://dx.doi.org/10.1186/1752-0509-5-55 Text en Copyright ©2011 Zhang et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title DomainRBF: a Bayesian regression approach to the prioritization of candidate domains for complex diseases
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3108930/
https://www.ncbi.nlm.nih.gov/pubmed/21504591
http://dx.doi.org/10.1186/1752-0509-5-55