Exploiting biological priors and sequence variants enhances QTL discovery and genomic prediction of complex traits

BACKGROUND: Dense SNP genotypes are often combined with complex trait phenotypes to map causal variants, study genetic architecture and provide genomic predictions for individuals with genotypes but no phenotype. A single method of analysis that jointly fits all genotypes in a Bayesian mixture model...

Bibliographic Details
Main authors: MacLeod, I. M., Bowman, P. J., Vander Jagt, C. J., Haile-Mariam, M., Kemper, K. E., Chamberlain, A. J., Schrooten, C., Hayes, B. J., Goddard, M. E.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects: Methodology Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4769584/
https://www.ncbi.nlm.nih.gov/pubmed/26920147
http://dx.doi.org/10.1186/s12864-016-2443-6
author MacLeod, I. M.
Bowman, P. J.
Vander Jagt, C. J.
Haile-Mariam, M.
Kemper, K. E.
Chamberlain, A. J.
Schrooten, C.
Hayes, B. J.
Goddard, M. E.
collection PubMed
description BACKGROUND: Dense SNP genotypes are often combined with complex trait phenotypes to map causal variants, study genetic architecture and provide genomic predictions for individuals with genotypes but no phenotype. A single method of analysis that jointly fits all genotypes in a Bayesian mixture model (BayesR) has been shown to competitively address all 3 purposes simultaneously. However, BayesR and other similar methods ignore prior biological knowledge and assume all genotypes are equally likely to affect the trait. While this assumption is reasonable for SNP array genotypes, it is less sensible if genotypes are whole-genome sequence variants, which should include causal variants.
RESULTS: We introduce a new method (BayesRC) based on BayesR that incorporates prior biological information in the analysis by defining classes of variants likely to be enriched for causal mutations. The information can be derived from a range of sources, including variant annotation, candidate gene lists and known causal variants. This information is then incorporated objectively in the analysis based on evidence of enrichment in the data. We demonstrate the increased power of BayesRC compared to BayesR using real dairy cattle genotypes with simulated phenotypes. The genotypes were imputed whole-genome sequence variants in coding regions combined with dense SNP markers. BayesRC increased the power to detect causal variants and increased the accuracy of genomic prediction. The relative improvement for genomic prediction was most apparent in validation populations that were not closely related to the reference population. We also applied BayesRC to real milk production phenotypes in dairy cattle using independent biological priors from gene expression analyses. Although current biological knowledge of which genes and variants affect milk production is still very incomplete, our results suggest that the new BayesRC method was equal to or more powerful than BayesR for detecting candidate causal variants and for genomic prediction of milk traits.
CONCLUSIONS: BayesRC provides a novel and flexible approach to simultaneously improving the accuracy of QTL discovery and genomic prediction by taking advantage of prior biological knowledge. Approaches such as BayesRC will become increasingly useful as biological knowledge accumulates regarding functional regions of the genome for a range of traits and species.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-2443-6) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4769584
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4769584 2016-02-28 BMC Genomics Methodology Article BioMed Central 2016-02-27 /pmc/articles/PMC4769584/ /pubmed/26920147 http://dx.doi.org/10.1186/s12864-016-2443-6 Text en © MacLeod et al. 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Exploiting biological priors and sequence variants enhances QTL discovery and genomic prediction of complex traits
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4769584/
https://www.ncbi.nlm.nih.gov/pubmed/26920147
http://dx.doi.org/10.1186/s12864-016-2443-6