
Dissecting High-Dimensional Phenotypes with Bayesian Sparse Factor Analysis of Genetic Covariance Matrices

Quantitative genetic studies that model complex, multivariate phenotypes are important for both evolutionary prediction and artificial selection. For example, changes in gene expression can provide insight into developmental and physiological mechanisms that link genotype and phenotype. However, cla...


Bibliographic Details
Main authors: Runcie, Daniel E.; Mukherjee, Sayan
Format: Online Article Text
Language: English
Published: Genetics Society of America 2013
Subjects: Investigations
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3697978/
https://www.ncbi.nlm.nih.gov/pubmed/23636737
http://dx.doi.org/10.1534/genetics.113.151217
author Runcie, Daniel E.
Mukherjee, Sayan
collection PubMed
description Quantitative genetic studies that model complex, multivariate phenotypes are important for both evolutionary prediction and artificial selection. For example, changes in gene expression can provide insight into developmental and physiological mechanisms that link genotype and phenotype. However, classical analytical techniques are poorly suited to quantitative genetic studies of gene expression where the number of traits assayed per individual can reach many thousand. Here, we derive a Bayesian genetic sparse factor model for estimating the genetic covariance matrix (G-matrix) of high-dimensional traits, such as gene expression, in a mixed-effects model. The key idea of our model is that we need consider only G-matrices that are biologically plausible. An organism’s entire phenotype is the result of processes that are modular and have limited complexity. This implies that the G-matrix will be highly structured. In particular, we assume that a limited number of intermediate traits (or factors, e.g., variations in development or physiology) control the variation in the high-dimensional phenotype, and that each of these intermediate traits is sparse – affecting only a few observed traits. The advantages of this approach are twofold. First, sparse factors are interpretable and provide biological insight into mechanisms underlying the genetic architecture. Second, enforcing sparsity helps prevent sampling errors from swamping out the true signal in high-dimensional data. We demonstrate the advantages of our model on simulated data and in an analysis of a published Drosophila melanogaster gene expression data set.
format Online
Article
Text
id pubmed-3697978
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-3697978 2013-07-02 Genetics Investigations Genetics Society of America 2013-07 Text en Copyright © 2013 by the Genetics Society of America. Available freely online through the author-supported open access option.
title Dissecting High-Dimensional Phenotypes with Bayesian Sparse Factor Analysis of Genetic Covariance Matrices
topic Investigations