Sparse Convolutional Denoising Autoencoders for Genotype Imputation

Genotype imputation, in which missing genotypes are inferred computationally, is an essential tool in genomic analyses ranging from genome-wide associations to phenotype prediction. Traditional genotype imputation methods are typically based on haplotype-clustering algorithms, hidden Markov models (H...

Bibliographic Details
Main Authors: Chen, Junjie, Shi, Xinghua
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6769581/
https://www.ncbi.nlm.nih.gov/pubmed/31466333
http://dx.doi.org/10.3390/genes10090652
_version_ 1783455270921306112
author Chen, Junjie
Shi, Xinghua
author_facet Chen, Junjie
Shi, Xinghua
author_sort Chen, Junjie
collection PubMed
description Genotype imputation, in which missing genotypes are inferred computationally, is an essential tool in genomic analyses ranging from genome-wide associations to phenotype prediction. Traditional genotype imputation methods are typically based on haplotype-clustering algorithms, hidden Markov models (HMMs), and statistical inference. Deep learning-based methods have recently been reported to handle missing-data problems well in various fields. To explore the performance of deep learning for genotype imputation, we propose a deep model called a sparse convolutional denoising autoencoder (SCDA) to impute missing genotypes. We constructed the SCDA model using a convolutional layer that can extract various correlation or linkage patterns in the genotype data, and we applied a sparse weight matrix resulting from L1 regularization to handle high-dimensional data. We comprehensively evaluated the performance of the SCDA model in different genotype imputation scenarios on yeast and human genotype data. Our results showed that SCDA is robust and significantly outperforms popular reference-free imputation methods. This study thus points to another novel application of deep learning models for missing data imputation in genomic studies.
format Online
Article
Text
id pubmed-6769581
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6769581 2019-10-30 Sparse Convolutional Denoising Autoencoders for Genotype Imputation Chen, Junjie Shi, Xinghua Genes (Basel) Article MDPI 2019-08-28 /pmc/articles/PMC6769581/ /pubmed/31466333 http://dx.doi.org/10.3390/genes10090652 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Chen, Junjie
Shi, Xinghua
Sparse Convolutional Denoising Autoencoders for Genotype Imputation
title Sparse Convolutional Denoising Autoencoders for Genotype Imputation
title_sort sparse convolutional denoising autoencoders for genotype imputation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6769581/
https://www.ncbi.nlm.nih.gov/pubmed/31466333
http://dx.doi.org/10.3390/genes10090652