Prediction of gene expression with cis-SNPs using mixed models and regularization methods

BACKGROUND: It has been shown that gene expression in human tissues is heritable, thus predicting gene expression using only SNPs becomes possible. The prediction of gene expression can offer important implications on the genetic architecture of individual functional associated SNPs and further inte...

Bibliographic Details
Main authors: Zeng, Ping; Zhou, Xiang; Huang, Shuiping
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5425981/
https://www.ncbi.nlm.nih.gov/pubmed/28490319
http://dx.doi.org/10.1186/s12864-017-3759-6
author Zeng, Ping
Zhou, Xiang
Huang, Shuiping
author_facet Zeng, Ping
Zhou, Xiang
Huang, Shuiping
author_sort Zeng, Ping
collection PubMed
description BACKGROUND: It has been shown that gene expression in human tissues is heritable, thus predicting gene expression using only SNPs becomes possible. The prediction of gene expression can offer important implications on the genetic architecture of individual functional associated SNPs and further interpretations of the molecular basis underlying human diseases. METHODS: We compared three types of methods for predicting gene expression using only cis-SNPs, including the polygenic model, i.e. linear mixed model (LMM), two sparse models, i.e. Lasso and elastic net (ENET), and the hybrid of LMM and sparse model, i.e. Bayesian sparse linear mixed model (BSLMM). The three kinds of prediction methods have very different assumptions of underlying genetic architectures. These methods were evaluated using simulations under various scenarios, and were applied to the Geuvadis gene expression data. RESULTS: The simulations showed that these four prediction methods (i.e. Lasso, ENET, LMM and BSLMM) behaved best when their respective modeling assumptions were satisfied, but BSLMM had a robust performance across a range of scenarios. According to R² of these models in the Geuvadis data, the four methods performed quite similarly. We did not observe any clustering or enrichment of predictive genes (defined as genes with R² ≥ 0.05) across the chromosomes, and also did not see there was any clear relationship between the proportion of the predictive genes and the proportion of genes in each chromosome. However, an interesting finding in the Geuvadis data was that highly predictive genes (e.g. R² ≥ 0.30) may have sparse genetic architectures since Lasso, ENET and BSLMM outperformed LMM for these genes; and this observation was validated in another gene expression data. We further showed that the predictive genes were enriched in approximately independent LD blocks.
CONCLUSIONS: Gene expression can be predicted with only cis-SNPs using well-developed prediction models and these predictive genes were enriched in some approximately independent LD blocks. The prediction of gene expression can shed some light on the functional interpretation for identified SNPs in GWASs.
format Online
Article
Text
id pubmed-5425981
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5425981 2017-05-12 Prediction of gene expression with cis-SNPs using mixed models and regularization methods Zeng, Ping Zhou, Xiang Huang, Shuiping BMC Genomics Research Article BACKGROUND: It has been shown that gene expression in human tissues is heritable, thus predicting gene expression using only SNPs becomes possible. The prediction of gene expression can offer important implications on the genetic architecture of individual functional associated SNPs and further interpretations of the molecular basis underlying human diseases. METHODS: We compared three types of methods for predicting gene expression using only cis-SNPs, including the polygenic model, i.e. linear mixed model (LMM), two sparse models, i.e. Lasso and elastic net (ENET), and the hybrid of LMM and sparse model, i.e. Bayesian sparse linear mixed model (BSLMM). The three kinds of prediction methods have very different assumptions of underlying genetic architectures. These methods were evaluated using simulations under various scenarios, and were applied to the Geuvadis gene expression data. RESULTS: The simulations showed that these four prediction methods (i.e. Lasso, ENET, LMM and BSLMM) behaved best when their respective modeling assumptions were satisfied, but BSLMM had a robust performance across a range of scenarios. According to R² of these models in the Geuvadis data, the four methods performed quite similarly. We did not observe any clustering or enrichment of predictive genes (defined as genes with R² ≥ 0.05) across the chromosomes, and also did not see there was any clear relationship between the proportion of the predictive genes and the proportion of genes in each chromosome. However, an interesting finding in the Geuvadis data was that highly predictive genes (e.g. R² ≥ 0.30) may have sparse genetic architectures since Lasso, ENET and BSLMM outperformed LMM for these genes; and this observation was validated in another gene expression data.
We further showed that the predictive genes were enriched in approximately independent LD blocks. CONCLUSIONS: Gene expression can be predicted with only cis-SNPs using well-developed prediction models and these predictive genes were enriched in some approximately independent LD blocks. The prediction of gene expression can shed some light on the functional interpretation for identified SNPs in GWASs. BioMed Central 2017-05-11 /pmc/articles/PMC5425981/ /pubmed/28490319 http://dx.doi.org/10.1186/s12864-017-3759-6 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Zeng, Ping
Zhou, Xiang
Huang, Shuiping
Prediction of gene expression with cis-SNPs using mixed models and regularization methods
title Prediction of gene expression with cis-SNPs using mixed models and regularization methods
title_full Prediction of gene expression with cis-SNPs using mixed models and regularization methods
title_fullStr Prediction of gene expression with cis-SNPs using mixed models and regularization methods
title_full_unstemmed Prediction of gene expression with cis-SNPs using mixed models and regularization methods
title_short Prediction of gene expression with cis-SNPs using mixed models and regularization methods
title_sort prediction of gene expression with cis-snps using mixed models and regularization methods
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5425981/
https://www.ncbi.nlm.nih.gov/pubmed/28490319
http://dx.doi.org/10.1186/s12864-017-3759-6