A framework for variation discovery and genotyping using next-generation DNA sequencing data

Recent advances in sequencing technology make it possible to comprehensively catalogue genetic variation in population samples, creating a foundation for understanding human disease, ancestry and evolution. The amounts of raw data produced are prodigious and many computational steps are required to...

Bibliographic Details
Main authors: DePristo, M.A., Banks, E., Poplin, R.E., Garimella, K.V., Maguire, J.R., Hartl, C., Philippakis, A.A., del Angel, G., Rivas, M.A., Hanna, M., McKenna, A., Fennell, T.J., Kernytsky, A.M., Sivachenko, A.Y., Cibulskis, K., Gabriel, S.B., Altshuler, D., Daly, M.J.
Format: Text
Language: English
Published: 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3083463/
https://www.ncbi.nlm.nih.gov/pubmed/21478889
http://dx.doi.org/10.1038/ng.806
author DePristo, M.A.
Banks, E.
Poplin, R.E.
Garimella, K.V.
Maguire, J.R.
Hartl, C.
Philippakis, A.A.
del Angel, G.
Rivas, M.A.
Hanna, M.
McKenna, A.
Fennell, T.J.
Kernytsky, A.M.
Sivachenko, A.Y.
Cibulskis, K.
Gabriel, S.B.
Altshuler, D.
Daly, M.J.
author_sort DePristo, M.A.
collection PubMed
description Recent advances in sequencing technology make it possible to comprehensively catalogue genetic variation in population samples, creating a foundation for understanding human disease, ancestry and evolution. The amounts of raw data produced are prodigious and many computational steps are required to translate this output into high-quality variant calls. We present a unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs. Our process includes (1) initial read mapping; (2) local realignment around indels; (3) base quality score recalibration; (4) SNP discovery and genotyping to find all potential variants; and (5) machine learning to separate true segregating variation from machine artifacts common to next-generation sequencing technologies. We discuss the application of these tools, instantiated in the Genome Analysis Toolkit (GATK), to deep whole-genome, whole-exome capture, and multi-sample low-pass (~4×) 1000 Genomes Project datasets.
format Text
id pubmed-3083463
institution National Center for Biotechnology Information
language English
publishDate 2011
record_format MEDLINE/PubMed
spelling pubmed-30834632011-11-01 A framework for variation discovery and genotyping using next-generation DNA sequencing data DePristo, M.A. Banks, E. Poplin, R.E. Garimella, K.V. Maguire, J.R. Hartl, C. Philippakis, A.A. del Angel, G. Rivas, M.A. Hanna, M. McKenna, A. Fennell, T.J. Kernytsky, A.M. Sivachenko, A.Y. Cibulskis, K. Gabriel, S.B. Altshuler, D. Daly, M.J. Nat Genet Article Recent advances in sequencing technology make it possible to comprehensively catalogue genetic variation in population samples, creating a foundation for understanding human disease, ancestry and evolution. The amounts of raw data produced are prodigious and many computational steps are required to translate this output into high-quality variant calls. We present a unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs. Our process includes (1) initial read mapping; (2) local realignment around indels; (3) base quality score recalibration; (4) SNP discovery and genotyping to find all potential variants; and (5) machine learning to separate true segregating variation from machine artifacts common to next-generation sequencing technologies. We discuss the application of these tools, instantiated in the Genome Analysis Toolkit (GATK), to deep whole-genome, whole-exome capture, and multi-sample low-pass (~4×) 1000 Genomes Project datasets. 2011-04-10 2011-05 /pmc/articles/PMC3083463/ /pubmed/21478889 http://dx.doi.org/10.1038/ng.806 Text en Users may view, print, copy, download and text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
title A framework for variation discovery and genotyping using next-generation DNA sequencing data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3083463/
https://www.ncbi.nlm.nih.gov/pubmed/21478889
http://dx.doi.org/10.1038/ng.806