Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome
The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%)...
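Main authors: | Wenger, Aaron M., Peluso, Paul, Rowell, William J., Chang, Pi-Chuan, Hall, Richard J., Concepcion, Gregory T., Ebler, Jana, Fungtammasan, Arkarachai, Kolesnikov, Alexey, Olson, Nathan D., Töpfer, Armin, Alonge, Michael, Mahmoud, Medhat, Qian, Yufeng, Chin, Chen-Shan, Phillippy, Adam M., Schatz, Michael C., Myers, Gene, DePristo, Mark A., Ruan, Jue, Marschall, Tobias, Sedlazeck, Fritz J., Zook, Justin M., Li, Heng, Koren, Sergey, Carroll, Andrew, Rank, David R., Hunkapiller, Michael W. |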
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6776680/ https://www.ncbi.nlm.nih.gov/pubmed/31406327 http://dx.doi.org/10.1038/s41587-019-0217-9 |
_version_ | 1783456492726255616 |
---|---|
author | Wenger, Aaron M. Peluso, Paul Rowell, William J. Chang, Pi-Chuan Hall, Richard J. Concepcion, Gregory T. Ebler, Jana Fungtammasan, Arkarachai Kolesnikov, Alexey Olson, Nathan D. Töpfer, Armin Alonge, Michael Mahmoud, Medhat Qian, Yufeng Chin, Chen-Shan Phillippy, Adam M. Schatz, Michael C. Myers, Gene DePristo, Mark A. Ruan, Jue Marschall, Tobias Sedlazeck, Fritz J. Zook, Justin M. Li, Heng Koren, Sergey Carroll, Andrew Rank, David R. Hunkapiller, Michael W. |
author_sort | Wenger, Aaron M. |
collection | PubMed |
description | The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb). We applied our approach to sequence the well-characterized human HG002/NA24385 genome and obtained precision and recall rates of at least 99.91% for single-nucleotide variants (SNVs), 95.98% for insertions and deletions <50 bp (indels) and 95.99% for structural variants. Our CCS method matches or exceeds the ability of short-read sequencing to detect small variants and structural variants. We estimate that 2,434 discordances are correctable mistakes in the ‘genome in a bottle’ (GIAB) benchmark set. Nearly all (99.64%) variants can be phased into haplotypes, further improving variant detection. De novo genome assembly using CCS reads alone produced a contiguous and accurate genome with a contig N50 of >15 megabases (Mb) and concordance of 99.997%, substantially outperforming assembly with less-accurate long reads. |
format | Online Article Text |
id | pubmed-6776680 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
record_format | MEDLINE/PubMed |
spelling | pubmed-6776680 2020-02-12 Nat Biotechnol Article 2019-08-12 2019-10 /pmc/articles/PMC6776680/ /pubmed/31406327 http://dx.doi.org/10.1038/s41587-019-0217-9 Text en Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms |
title | Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6776680/ https://www.ncbi.nlm.nih.gov/pubmed/31406327 http://dx.doi.org/10.1038/s41587-019-0217-9 |