Sparse Convolutional Denoising Autoencoders for Genotype Imputation
Genotype imputation, where missing genotypes can be computationally imputed, is an essential tool in genomic analysis ranging from genome-wide associations to phenotype prediction. Traditional genotype imputation methods are typically based on haplotype-clustering algorithms, hidden Markov models (H...
Main authors: | Chen, Junjie; Shi, Xinghua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6769581/ https://www.ncbi.nlm.nih.gov/pubmed/31466333 http://dx.doi.org/10.3390/genes10090652 |
_version_ | 1783455270921306112 |
---|---|
author | Chen, Junjie Shi, Xinghua |
author_facet | Chen, Junjie Shi, Xinghua |
author_sort | Chen, Junjie |
collection | PubMed |
description | Genotype imputation, where missing genotypes can be computationally imputed, is an essential tool in genomic analysis ranging from genome-wide associations to phenotype prediction. Traditional genotype imputation methods are typically based on haplotype-clustering algorithms, hidden Markov models (HMMs), and statistical inference. Deep learning-based methods have recently been reported to suitably address missing-data problems in various fields. To explore the performance of deep learning for genotype imputation, in this study, we propose a deep model called a sparse convolutional denoising autoencoder (SCDA) to impute missing genotypes. We constructed the SCDA model using a convolutional layer that can extract various correlation or linkage patterns in the genotype data and by applying a sparse weight matrix resulting from L1 regularization to handle high-dimensional data. We comprehensively evaluated the performance of the SCDA model in different scenarios for genotype imputation on yeast and human genotype data. Our results showed that SCDA has strong robustness and significantly outperforms popular reference-free imputation methods. This study thus points to another novel application of deep learning models for missing data imputation in genomic studies. |
format | Online Article Text |
id | pubmed-6769581 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-6769581 2019-10-30 Sparse Convolutional Denoising Autoencoders for Genotype Imputation Chen, Junjie Shi, Xinghua Genes (Basel) Article Genotype imputation, where missing genotypes can be computationally imputed, is an essential tool in genomic analysis ranging from genome-wide associations to phenotype prediction. Traditional genotype imputation methods are typically based on haplotype-clustering algorithms, hidden Markov models (HMMs), and statistical inference. Deep learning-based methods have recently been reported to suitably address missing-data problems in various fields. To explore the performance of deep learning for genotype imputation, in this study, we propose a deep model called a sparse convolutional denoising autoencoder (SCDA) to impute missing genotypes. We constructed the SCDA model using a convolutional layer that can extract various correlation or linkage patterns in the genotype data and by applying a sparse weight matrix resulting from L1 regularization to handle high-dimensional data. We comprehensively evaluated the performance of the SCDA model in different scenarios for genotype imputation on yeast and human genotype data. Our results showed that SCDA has strong robustness and significantly outperforms popular reference-free imputation methods. This study thus points to another novel application of deep learning models for missing data imputation in genomic studies. MDPI 2019-08-28 /pmc/articles/PMC6769581/ /pubmed/31466333 http://dx.doi.org/10.3390/genes10090652 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Chen, Junjie Shi, Xinghua Sparse Convolutional Denoising Autoencoders for Genotype Imputation |
title | Sparse Convolutional Denoising Autoencoders for Genotype Imputation |
title_full | Sparse Convolutional Denoising Autoencoders for Genotype Imputation |
title_fullStr | Sparse Convolutional Denoising Autoencoders for Genotype Imputation |
title_full_unstemmed | Sparse Convolutional Denoising Autoencoders for Genotype Imputation |
title_short | Sparse Convolutional Denoising Autoencoders for Genotype Imputation |
title_sort | sparse convolutional denoising autoencoders for genotype imputation |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6769581/ https://www.ncbi.nlm.nih.gov/pubmed/31466333 http://dx.doi.org/10.3390/genes10090652 |
work_keys_str_mv | AT chenjunjie sparseconvolutionaldenoisingautoencodersforgenotypeimputation AT shixinghua sparseconvolutionaldenoisingautoencodersforgenotypeimputation |
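
The description in this record names three ingredients of the SCDA model: a denoising setup (the model sees genotypes with entries hidden), a convolutional layer over neighboring sites, and an L1 penalty that encourages sparse weights. The following is a minimal standard-library sketch of those three ingredients only. It is not the authors' implementation, and the 0/1/2 genotype coding, the `MISSING` marker, and every function name are assumptions made for illustration.

```python
import random

MISSING = -1  # hypothetical marker for a hidden genotype call


def mask_genotypes(genotypes, missing_rate, rng):
    """Denoising corruption: hide each genotype call with probability missing_rate."""
    return [MISSING if rng.random() < missing_rate else g for g in genotypes]


def conv1d_same(signal, kernel):
    """1D cross-correlation with zero padding; output has the input's length.

    Assumes an odd kernel length so the window is centered on each site.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        total = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                total += w * signal[j]
        out.append(total)
    return out


def l1_penalty(weights, lam):
    """L1 regularization term: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)


def masked_accuracy(truth, corrupted, predicted):
    """Fraction of hidden positions where the prediction matches the truth."""
    hidden = [i for i, g in enumerate(corrupted) if g == MISSING]
    if not hidden:
        return None
    return sum(truth[i] == predicted[i] for i in hidden) / len(hidden)


if __name__ == "__main__":
    rng = random.Random(0)
    truth = [0, 1, 2, 1, 0, 2, 2, 1]  # assumed coding: copies of the alternate allele
    corrupted = mask_genotypes(truth, 0.3, rng)
    kernel = [0.25, 0.5, 0.25]  # illustrative weights, not learned
    features = conv1d_same([max(g, 0) for g in corrupted], kernel)
    print(corrupted, features, l1_penalty(kernel, 0.01))
```

In a trained model the kernel weights would be learned by minimizing a reconstruction loss plus the L1 term, and accuracy would be scored at the hidden positions, as `masked_accuracy` does here.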