Bayesian Methods for Multivariate Modeling of Pleiotropic SNP Associations and Genetic Risk Prediction

Genome-wide association studies (GWAS) have identified numerous associations between genetic loci and individual phenotypes; however, relatively few GWAS have attempted to detect pleiotropic associations, in which loci are simultaneously associated with multiple distinct phenotypes. We show that ple...

Bibliographic Details
Main Authors: Hartley, Stephen W., Monti, Stefano, Liu, Ching-Ti, Steinberg, Martin H., Sebastiani, Paola
Format: Online Article Text
Language: English
Published: Frontiers Research Foundation 2012
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3438684/
https://www.ncbi.nlm.nih.gov/pubmed/22973300
http://dx.doi.org/10.3389/fgene.2012.00176
collection PubMed
description Genome-wide association studies (GWAS) have identified numerous associations between genetic loci and individual phenotypes; however, relatively few GWAS have attempted to detect pleiotropic associations, in which loci are simultaneously associated with multiple distinct phenotypes. We show that pleiotropic associations can be directly modeled via the construction of simple Bayesian networks, and that these models can be applied to produce single or ensembles of Bayesian classifiers that leverage pleiotropy to improve genetic risk prediction. The proposed method includes two phases: (1) Bayesian model comparison, to identify Single-Nucleotide Polymorphisms (SNPs) associated with one or more traits; and (2) cross-validation feature selection, in which a final set of SNPs is selected to optimize prediction. To demonstrate the capabilities and limitations of the method, a total of 1600 case-control GWAS datasets with two dichotomous phenotypes were simulated under 16 scenarios, varying the association strengths of causal SNPs, the size of the discovery sets, the balance between cases and controls, and the number of pleiotropic causal SNPs. Across the 16 scenarios, prediction accuracy varied from 90 to 50%. In the 14 scenarios that included pleiotropically associated SNPs, the pleiotropic model search and prediction methods consistently outperformed the naive model search and prediction. In the two scenarios in which there were no true pleiotropic SNPs, the differences between the pleiotropic and naive model searches were minimal. To further evaluate the method on real data, a discovery set of 1071 sickle cell disease (SCD) patients was used to search for pleiotropic associations between cerebral vascular accidents and fetal hemoglobin level. 
Classification was performed on a smaller validation set of 352 SCD patients, and showed that the inclusion of pleiotropic SNPs may slightly improve prediction, although the difference was not statistically significant. The proposed method is robust, computationally efficient, and provides a powerful new approach for detecting and modeling pleiotropic disease loci.
format Online
Article
Text
id pubmed-3438684
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Frontiers Research Foundation
record_format MEDLINE/PubMed
spelling pubmed-3438684 2012-09-12 Front Genet Genetics Frontiers Research Foundation 2012-09-11 /pmc/articles/PMC3438684/ /pubmed/22973300 http://dx.doi.org/10.3389/fgene.2012.00176 Text en Copyright © 2012 Hartley, Monti, Liu, Steinberg and Sebastiani. http://www.frontiersin.org/licenseagreement This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.
title Bayesian Methods for Multivariate Modeling of Pleiotropic SNP Associations and Genetic Risk Prediction
topic Genetics