
Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome

The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%)...


Bibliographic Details
Main authors: Wenger, Aaron M., Peluso, Paul, Rowell, William J., Chang, Pi-Chuan, Hall, Richard J., Concepcion, Gregory T., Ebler, Jana, Fungtammasan, Arkarachai, Kolesnikov, Alexey, Olson, Nathan D., Töpfer, Armin, Alonge, Michael, Mahmoud, Medhat, Qian, Yufeng, Chin, Chen-Shan, Phillippy, Adam M., Schatz, Michael C., Myers, Gene, DePristo, Mark A., Ruan, Jue, Marschall, Tobias, Sedlazeck, Fritz J., Zook, Justin M., Li, Heng, Koren, Sergey, Carroll, Andrew, Rank, David R., Hunkapiller, Michael W.
Format: Online Article Text
Language: English
Published: 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6776680/
https://www.ncbi.nlm.nih.gov/pubmed/31406327
http://dx.doi.org/10.1038/s41587-019-0217-9
author Wenger, Aaron M.
Peluso, Paul
Rowell, William J.
Chang, Pi-Chuan
Hall, Richard J.
Concepcion, Gregory T.
Ebler, Jana
Fungtammasan, Arkarachai
Kolesnikov, Alexey
Olson, Nathan D.
Töpfer, Armin
Alonge, Michael
Mahmoud, Medhat
Qian, Yufeng
Chin, Chen-Shan
Phillippy, Adam M.
Schatz, Michael C.
Myers, Gene
DePristo, Mark A.
Ruan, Jue
Marschall, Tobias
Sedlazeck, Fritz J.
Zook, Justin M.
Li, Heng
Koren, Sergey
Carroll, Andrew
Rank, David R.
Hunkapiller, Michael W.
collection PubMed
description The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb). We applied our approach to sequence the well-characterized human HG002/NA24385 genome and obtained precision and recall rates of at least 99.91% for single-nucleotide variants (SNVs), 95.98% for insertions and deletions <50 bp (indels) and 95.99% for structural variants. Our CCS method matches or exceeds the ability of short-read sequencing to detect small variants and structural variants. We estimate that 2,434 discordances are correctable mistakes in the ‘genome in a bottle’ (GIAB) benchmark set. Nearly all (99.64%) variants can be phased into haplotypes, further improving variant detection. De novo genome assembly using CCS reads alone produced a contiguous and accurate genome with a contig N50 of >15 megabases (Mb) and concordance of 99.997%, substantially outperforming assembly with less-accurate long reads.
format Online
Article
Text
id pubmed-6776680
institution National Center for Biotechnology Information
language English
publishDate 2019
record_format MEDLINE/PubMed
spelling pubmed-6776680 2020-02-12 Nat Biotechnol Article
2019-08-12 2019-10 /pmc/articles/PMC6776680/ /pubmed/31406327 http://dx.doi.org/10.1038/s41587-019-0217-9 Text en Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
title Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome
topic Article