Dissecting High-Dimensional Phenotypes with Bayesian Sparse Factor Analysis of Genetic Covariance Matrices
Quantitative genetic studies that model complex, multivariate phenotypes are important for both evolutionary prediction and artificial selection. For example, changes in gene expression can provide insight into developmental and physiological mechanisms that link genotype and phenotype. However, cla...
Main authors: | Runcie, Daniel E.; Mukherjee, Sayan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Genetics Society of America 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3697978/ https://www.ncbi.nlm.nih.gov/pubmed/23636737 http://dx.doi.org/10.1534/genetics.113.151217 |
_version_ | 1782275216374759424 |
---|---|
author | Runcie, Daniel E. Mukherjee, Sayan |
author_facet | Runcie, Daniel E. Mukherjee, Sayan |
author_sort | Runcie, Daniel E. |
collection | PubMed |
description | Quantitative genetic studies that model complex, multivariate phenotypes are important for both evolutionary prediction and artificial selection. For example, changes in gene expression can provide insight into developmental and physiological mechanisms that link genotype and phenotype. However, classical analytical techniques are poorly suited to quantitative genetic studies of gene expression where the number of traits assayed per individual can reach many thousand. Here, we derive a Bayesian genetic sparse factor model for estimating the genetic covariance matrix (G-matrix) of high-dimensional traits, such as gene expression, in a mixed-effects model. The key idea of our model is that we need consider only G-matrices that are biologically plausible. An organism’s entire phenotype is the result of processes that are modular and have limited complexity. This implies that the G-matrix will be highly structured. In particular, we assume that a limited number of intermediate traits (or factors, e.g., variations in development or physiology) control the variation in the high-dimensional phenotype, and that each of these intermediate traits is sparse – affecting only a few observed traits. The advantages of this approach are twofold. First, sparse factors are interpretable and provide biological insight into mechanisms underlying the genetic architecture. Second, enforcing sparsity helps prevent sampling errors from swamping out the true signal in high-dimensional data. We demonstrate the advantages of our model on simulated data and in an analysis of a published Drosophila melanogaster gene expression data set. |
format | Online Article Text |
id | pubmed-3697978 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Genetics Society of America |
record_format | MEDLINE/PubMed |
spelling | pubmed-3697978 2013-07-02 Dissecting High-Dimensional Phenotypes with Bayesian Sparse Factor Analysis of Genetic Covariance Matrices Runcie, Daniel E. Mukherjee, Sayan Genetics Investigations Genetics Society of America 2013-07 /pmc/articles/PMC3697978/ /pubmed/23636737 http://dx.doi.org/10.1534/genetics.113.151217 Text en Copyright © 2013 by the Genetics Society of America Available freely online through the author-supported open access option. |
spellingShingle | Investigations Runcie, Daniel E. Mukherjee, Sayan Dissecting High-Dimensional Phenotypes with Bayesian Sparse Factor Analysis of Genetic Covariance Matrices |
title | Dissecting High-Dimensional Phenotypes with Bayesian Sparse Factor Analysis of Genetic Covariance Matrices |
topic | Investigations |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3697978/ https://www.ncbi.nlm.nih.gov/pubmed/23636737 http://dx.doi.org/10.1534/genetics.113.151217 |
work_keys_str_mv | AT runciedaniele dissectinghighdimensionalphenotypeswithbayesiansparsefactoranalysisofgeneticcovariancematrices AT mukherjeesayan dissectinghighdimensionalphenotypeswithbayesiansparsefactoranalysisofgeneticcovariancematrices |
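
The description above states the model's structural assumption: a limited number of sparse intermediate traits (factors) drive the variation in a high-dimensional phenotype, so the G-matrix is highly structured. The sketch below illustrates only that structure, a sparse low-rank part plus trait-specific variance. It is not the authors' Bayesian sampler, and the function name, the parameters `p`, `k`, and `traits_per_factor`, and the random draws are all hypothetical choices made for illustration.

```python
import numpy as np

def sparse_factor_g_matrix(p=50, k=3, traits_per_factor=5, seed=0):
    """Build an illustrative G-matrix from k sparse factors over p traits.

    Assumption (not taken from the record): G = L @ L.T + diag(psi), where
    each column of the loading matrix L touches only a few traits.
    """
    rng = np.random.default_rng(seed)
    loadings = np.zeros((p, k))
    for j in range(k):
        # Each factor affects only a small, randomly chosen subset of traits.
        idx = rng.choice(p, size=traits_per_factor, replace=False)
        loadings[idx, j] = rng.normal(size=traits_per_factor)
    # Trait-specific genetic variance not explained by the factors.
    residual_var = rng.uniform(0.1, 0.5, size=p)
    G = loadings @ loadings.T + np.diag(residual_var)
    return loadings, G

loadings, G = sparse_factor_g_matrix()
print(G.shape)                                   # size of the full G-matrix
print(np.count_nonzero(loadings, axis=0))        # traits touched by each factor
print(np.linalg.matrix_rank(loadings @ loadings.T))  # rank of the factor part
```

In this toy setup the factor part needs at most `p * k` loadings (most of them zero) instead of the `p * (p + 1) / 2` free entries of an unstructured covariance matrix, which is the sense in which sparsity guards against sampling noise in high dimensions.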