Whole genome SNP genotype piecemeal imputation

BACKGROUND: Despite ongoing reductions in the cost of sequencing technologies, whole genome SNP genotype imputation is often used as an alternative for obtaining abundant SNP genotypes for genome wide association studies. Several existing genotype imputation methods can be efficient for this purpose...

Bibliographic Details
Main Authors: Wang, Yining; Wylie, Tim; Stothard, Paul; Lin, Guohui
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects: Methodology Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4619096/
https://www.ncbi.nlm.nih.gov/pubmed/26498158
http://dx.doi.org/10.1186/s12859-015-0770-2
author Wang, Yining
Wylie, Tim
Stothard, Paul
Lin, Guohui
collection PubMed
description BACKGROUND: Despite ongoing reductions in the cost of sequencing technologies, whole genome SNP genotype imputation is often used as an alternative way to obtain abundant SNP genotypes for genome-wide association studies. Several existing genotype imputation methods are efficient for this purpose while achieving various levels of imputation accuracy. Recent empirical results have shown that two-step imputation may improve accuracy by imputing the low-density genotyped study animals to a medium-density array first and then to the target density. We are interested in building a series of staircase arrays that lead from the low-density array to the high-density array, or even to the whole genome, such that genotype imputation along these staircases achieves the highest accuracy.
RESULTS: For genotype imputation from a lower density to a higher density, we first show how to select untyped SNPs to construct a medium-density array. We then determine, for each selected SNP, which untyped SNPs are to be imputed in the add-one two-step imputation, and finally show how the clusters of imputed genotypes are pieced together into the final imputation result. Extensive empirical experiments on several hundred sequenced and genotyped animals show that our novel two-step piecemeal imputation always improves on one-step imputation by the state-of-the-art methods Beagle and FImpute. Using the two-step piecemeal imputation, we report preliminary success on whole genome SNP genotype imputation for genotyped animals via a series of staircase arrays.
CONCLUSIONS: From a low SNP density to the whole genome, intermediate pseudo-arrays can be computationally constructed by selecting the most informative SNPs for untyped SNP genotype imputation. Such pseudo-array staircases impute more accurately than the classic one-step imputation.
format Online
Article
Text
id pubmed-4619096
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4619096 2015-10-25
Whole genome SNP genotype piecemeal imputation
Wang, Yining; Wylie, Tim; Stothard, Paul; Lin, Guohui
BMC Bioinformatics
Methodology Article
BioMed Central 2015-10-23
/pmc/articles/PMC4619096/ /pubmed/26498158 http://dx.doi.org/10.1186/s12859-015-0770-2
Text en
© Wang et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Whole genome SNP genotype piecemeal imputation
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4619096/
https://www.ncbi.nlm.nih.gov/pubmed/26498158
http://dx.doi.org/10.1186/s12859-015-0770-2