Rapid, Reference-Free human genotype imputation with denoising autoencoders
Genotype imputation is a foundational tool for population genetics. Standard statistical imputation approaches rely on the co-location of large whole-genome sequencing-based reference panels, powerful computing environments, and potentially sensitive genetic study data. This results in computational...
Main authors: | Dias, Raquel; Evans, Doug; Chen, Shang-Fu; Chen, Kai-Yu; Loguercio, Salvatore; Chan, Leslie; Torkamani, Ali |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | eLife Sciences Publications, Ltd, 2022 |
Subjects: | Computational and Systems Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9555874/ https://www.ncbi.nlm.nih.gov/pubmed/36148981 http://dx.doi.org/10.7554/eLife.75600 |
_version_ | 1784806947859464192 |
---|---|
author | Dias, Raquel; Evans, Doug; Chen, Shang-Fu; Chen, Kai-Yu; Loguercio, Salvatore; Chan, Leslie; Torkamani, Ali |
collection | PubMed |
description | Genotype imputation is a foundational tool for population genetics. Standard statistical imputation approaches rely on the co-location of large whole-genome sequencing-based reference panels, powerful computing environments, and potentially sensitive genetic study data. This results in computational resource and privacy-risk barriers to access to cutting-edge imputation techniques. Moreover, the accuracy of current statistical approaches is known to degrade in regions of low and complex linkage disequilibrium. Artificial neural network-based imputation approaches may overcome these limitations by encoding complex genotype relationships in easily portable inference models. Here, we demonstrate an autoencoder-based approach for genotype imputation, using a large, commonly used reference panel, and spanning the entirety of human chromosome 22. Our autoencoder-based genotype imputation strategy achieved superior imputation accuracy across the allele-frequency spectrum and across genomes of diverse ancestry, while delivering at least fourfold faster inference run time relative to standard imputation tools. |
format | Online Article Text |
id | pubmed-9555874 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | eLife Sciences Publications, Ltd |
record_format | MEDLINE/PubMed |
spelling | pubmed-9555874; 2022-10-13; eLife; Computational and Systems Biology; eLife Sciences Publications, Ltd; 2022-09-23; /pmc/articles/PMC9555874/ /pubmed/36148981 http://dx.doi.org/10.7554/eLife.75600; Text; en; © 2022, Dias et al. This article is distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use and redistribution provided that the original author and source are credited. |
title | Rapid, Reference-Free human genotype imputation with denoising autoencoders |
topic | Computational and Systems Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9555874/ https://www.ncbi.nlm.nih.gov/pubmed/36148981 http://dx.doi.org/10.7554/eLife.75600 |