Efficiency and Power as a Function of Sequence Coverage, SNP Array Density, and Imputation

High coverage whole genome sequencing provides near complete information about genetic variation. However, other technologies can be more efficient in some settings by (a) reducing redundant coverage within samples and (b) exploiting patterns of genetic variation across samples. To characterize as m...

Bibliographic Details
Main authors: Flannick, Jason; Korn, Joshua M.; Fontanillas, Pierre; Grant, George B.; Banks, Eric; Depristo, Mark A.; Altshuler, David
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3395607/
https://www.ncbi.nlm.nih.gov/pubmed/22807667
http://dx.doi.org/10.1371/journal.pcbi.1002604
author Flannick, Jason
Korn, Joshua M.
Fontanillas, Pierre
Grant, George B.
Banks, Eric
Depristo, Mark A.
Altshuler, David
collection PubMed
description High coverage whole genome sequencing provides near complete information about genetic variation. However, other technologies can be more efficient in some settings by (a) reducing redundant coverage within samples and (b) exploiting patterns of genetic variation across samples. To characterize as many samples as possible, many genetic studies therefore employ lower coverage sequencing or SNP array genotyping coupled to statistical imputation. To compare these approaches individually and in conjunction, we developed a statistical framework to estimate genotypes jointly from sequence reads, array intensities, and imputation. In European samples, we find similar sensitivity (89%) and specificity (99.6%) from imputation with either 1× sequencing or 1 M SNP arrays. Sensitivity is increased, particularly for low-frequency polymorphisms ([Image: see text]), when low coverage sequence reads are added to dense genome-wide SNP arrays — the converse, however, is not true. At sites where sequence reads and array intensities produce different sample genotypes, joint analysis reduces genotype errors and identifies novel error modes. Our joint framework informs the use of next-generation sequencing in genome wide association studies and supports development of improved methods for genotype calling.
format Online
Article
Text
id pubmed-3395607
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3395607 2012-07-17 Efficiency and Power as a Function of Sequence Coverage, SNP Array Density, and Imputation Flannick, Jason Korn, Joshua M. Fontanillas, Pierre Grant, George B. Banks, Eric Depristo, Mark A. Altshuler, David PLoS Comput Biol Research Article Public Library of Science 2012-07-12 /pmc/articles/PMC3395607/ /pubmed/22807667 http://dx.doi.org/10.1371/journal.pcbi.1002604 Text en Flannick et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Efficiency and Power as a Function of Sequence Coverage, SNP Array Density, and Imputation
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3395607/
https://www.ncbi.nlm.nih.gov/pubmed/22807667
http://dx.doi.org/10.1371/journal.pcbi.1002604