A deep auto-encoder model for gene expression prediction
BACKGROUND: Gene expression is a key intermediate level through which genotypes lead to a particular trait. Gene expression is affected by various factors, including the genotypes of genetic variants. With the aim of delineating the genetic impact on gene expression, we build a deep auto-encoder model to assess ho...
Main authors: | Xie, Rui; Wen, Jia; Quitadamo, Andrew; Cheng, Jianlin; Shi, Xinghua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5773895/ https://www.ncbi.nlm.nih.gov/pubmed/29219072 http://dx.doi.org/10.1186/s12864-017-4226-0 |
_version_ | 1783293659595145216 |
---|---|
author | Xie, Rui Wen, Jia Quitadamo, Andrew Cheng, Jianlin Shi, Xinghua |
author_facet | Xie, Rui Wen, Jia Quitadamo, Andrew Cheng, Jianlin Shi, Xinghua |
author_sort | Xie, Rui |
collection | PubMed |
description | BACKGROUND: Gene expression is a key intermediate level through which genotypes lead to a particular trait. Gene expression is affected by various factors, including the genotypes of genetic variants. With the aim of delineating the genetic impact on gene expression, we build a deep auto-encoder model to assess how well genetic variants contribute to gene expression changes. This new deep learning model is a regression-based predictive model based on the MultiLayer Perceptron and Stacked Denoising Auto-encoder (MLP-SAE). The model is trained using a stacked denoising auto-encoder for feature selection and a multilayer perceptron framework for backpropagation. We further improve the model by introducing dropout to prevent overfitting and improve performance. RESULTS: To demonstrate the usage of this model, we apply MLP-SAE to a real genomic dataset with genotypes and gene expression profiles measured in yeast. Our results show that the MLP-SAE model with dropout outperforms other models, including Lasso, Random Forests and the MLP-SAE model without dropout. Using the MLP-SAE model with dropout, we show that gene expression quantifications predicted by the model solely from genotypes align well with true gene expression patterns. CONCLUSION: We provide a deep auto-encoder model for predicting gene expression from SNP genotypes. This study demonstrates that deep learning is appropriate for tackling another genomic problem, i.e., building predictive models to understand genotypes’ contribution to gene expression. With the emerging availability of richer genomic data, we anticipate that deep learning models will play a bigger role in modeling and interpreting genomics. |
format | Online Article Text |
id | pubmed-5773895 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5773895 2018-01-26 A deep auto-encoder model for gene expression prediction Xie, Rui Wen, Jia Quitadamo, Andrew Cheng, Jianlin Shi, Xinghua BMC Genomics Research BACKGROUND: Gene expression is a key intermediate level through which genotypes lead to a particular trait. Gene expression is affected by various factors, including the genotypes of genetic variants. With the aim of delineating the genetic impact on gene expression, we build a deep auto-encoder model to assess how well genetic variants contribute to gene expression changes. This new deep learning model is a regression-based predictive model based on the MultiLayer Perceptron and Stacked Denoising Auto-encoder (MLP-SAE). The model is trained using a stacked denoising auto-encoder for feature selection and a multilayer perceptron framework for backpropagation. We further improve the model by introducing dropout to prevent overfitting and improve performance. RESULTS: To demonstrate the usage of this model, we apply MLP-SAE to a real genomic dataset with genotypes and gene expression profiles measured in yeast. Our results show that the MLP-SAE model with dropout outperforms other models, including Lasso, Random Forests and the MLP-SAE model without dropout. Using the MLP-SAE model with dropout, we show that gene expression quantifications predicted by the model solely from genotypes align well with true gene expression patterns. CONCLUSION: We provide a deep auto-encoder model for predicting gene expression from SNP genotypes. This study demonstrates that deep learning is appropriate for tackling another genomic problem, i.e., building predictive models to understand genotypes’ contribution to gene expression. With the emerging availability of richer genomic data, we anticipate that deep learning models will play a bigger role in modeling and interpreting genomics.
BioMed Central 2017-11-17 /pmc/articles/PMC5773895/ /pubmed/29219072 http://dx.doi.org/10.1186/s12864-017-4226-0 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Xie, Rui Wen, Jia Quitadamo, Andrew Cheng, Jianlin Shi, Xinghua A deep auto-encoder model for gene expression prediction |
title | A deep auto-encoder model for gene expression prediction |
title_full | A deep auto-encoder model for gene expression prediction |
title_fullStr | A deep auto-encoder model for gene expression prediction |
title_full_unstemmed | A deep auto-encoder model for gene expression prediction |
title_short | A deep auto-encoder model for gene expression prediction |
title_sort | deep auto-encoder model for gene expression prediction |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5773895/ https://www.ncbi.nlm.nih.gov/pubmed/29219072 http://dx.doi.org/10.1186/s12864-017-4226-0 |
work_keys_str_mv | AT xierui adeepautoencodermodelforgeneexpressionprediction AT wenjia adeepautoencodermodelforgeneexpressionprediction AT quitadamoandrew adeepautoencodermodelforgeneexpressionprediction AT chengjianlin adeepautoencodermodelforgeneexpressionprediction AT shixinghua adeepautoencodermodelforgeneexpressionprediction AT xierui deepautoencodermodelforgeneexpressionprediction AT wenjia deepautoencodermodelforgeneexpressionprediction AT quitadamoandrew deepautoencodermodelforgeneexpressionprediction AT chengjianlin deepautoencodermodelforgeneexpressionprediction AT shixinghua deepautoencodermodelforgeneexpressionprediction |
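
The description above names three ingredients: a denoising auto-encoder, a multilayer perceptron used for regression, and dropout. The following is a minimal, untrained sketch of how those pieces fit together in a forward pass. It is not the authors' implementation: the layer sizes, the noise and dropout rates, the sigmoid activation, the masking-noise corruption, and every function name are assumptions chosen for illustration, and the genotype vector is made-up data.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases, activation=None):
    """Fully connected layer: activation(W x + b), linear if activation is None."""
    out = []
    for w_row, b in zip(weights, biases):
        z = sum(w * v for w, v in zip(w_row, x)) + b
        out.append(activation(z) if activation else z)
    return out


def corrupt(x, noise_rate, rng):
    """Masking noise for the denoising step: zero each input with prob. noise_rate."""
    return [0.0 if rng.random() < noise_rate else v for v in x]


def dropout(h, drop_rate, rng, training=True):
    """Inverted dropout: drop units with prob. drop_rate, rescale the survivors."""
    if not training or drop_rate == 0.0:
        return list(h)
    keep = 1.0 - drop_rate
    return [v / keep if rng.random() < keep else 0.0 for v in h]


def reconstruction_loss(x, enc, dec, noise_rate, rng):
    """Denoising objective: encode a corrupted input, compare output to the clean input."""
    noisy = corrupt(x, noise_rate, rng)
    hidden = dense(noisy, *enc, activation=sigmoid)
    recon = dense(hidden, *dec, activation=sigmoid)
    return sum((r - v) ** 2 for r, v in zip(recon, x)) / len(x)


def predict(genotype, layers, drop_rate, rng, training):
    """MLP forward pass: sigmoid hidden layers with dropout, linear regression output."""
    h = genotype
    for weights, biases in layers[:-1]:
        h = dense(h, weights, biases, activation=sigmoid)
        h = dropout(h, drop_rate, rng, training)
    weights, biases = layers[-1]
    return dense(h, weights, biases)


rng = random.Random(0)
genotype = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0]  # made-up 0/1 marker values
enc = init_layer(6, 4, rng)                # encoder, reused as first MLP layer
dec = init_layer(4, 6, rng)                # decoder, used only for pre-training
loss = reconstruction_loss(genotype, enc, dec, noise_rate=0.2, rng=rng)
layers = [enc, init_layer(4, 3, rng), init_layer(3, 2, rng)]
pred_eval = predict(genotype, layers, drop_rate=0.5, rng=rng, training=False)
pred_train = predict(genotype, layers, drop_rate=0.5, rng=rng, training=True)
print(loss, pred_eval, pred_train)
```

In a full pipeline of this kind, each encoder would typically be pre-trained by minimising the reconstruction loss, and the stacked encoders would then be fine-tuned together with the regression output by backpropagation. Dropout is active only during training, so predictions at evaluation time are deterministic.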