
Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes

De novo assembly of a human genome using nanopore long-read sequences has been reported, but it used more than 150,000 CPU hours and weeks of wall-clock time. To enable rapid human genome assembly, we present Shasta, a de novo long-read assembler, and polishing algorithms named MarginPolish and HELE...


Bibliographic Details
Main authors: Shafin, Kishwar, Pesout, Trevor, Lorig-Roach, Ryan, Haukness, Marina, Olsen, Hugh E., Bosworth, Colleen, Armstrong, Joel, Tigyi, Kristof, Maurer, Nicholas, Koren, Sergey, Sedlazeck, Fritz J., Marschall, Tobias, Mayes, Simon, Costa, Vania, Zook, Justin M., Liu, Kelvin J., Kilburn, Duncan, Sorensen, Melanie, Munson, Katy M., Vollger, Mitchell R., Monlong, Jean, Garrison, Erik, Eichler, Evan E., Salama, Sofie, Haussler, David, Green, Richard E., Akeson, Mark, Phillippy, Adam, Miga, Karen H., Carnevali, Paolo, Jain, Miten, Paten, Benedict
Format: Online Article Text
Language: English
Published: Nature Publishing Group US 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7483855/
https://www.ncbi.nlm.nih.gov/pubmed/32686750
http://dx.doi.org/10.1038/s41587-020-0503-6
author Shafin, Kishwar
Pesout, Trevor
Lorig-Roach, Ryan
Haukness, Marina
Olsen, Hugh E.
Bosworth, Colleen
Armstrong, Joel
Tigyi, Kristof
Maurer, Nicholas
Koren, Sergey
Sedlazeck, Fritz J.
Marschall, Tobias
Mayes, Simon
Costa, Vania
Zook, Justin M.
Liu, Kelvin J.
Kilburn, Duncan
Sorensen, Melanie
Munson, Katy M.
Vollger, Mitchell R.
Monlong, Jean
Garrison, Erik
Eichler, Evan E.
Salama, Sofie
Haussler, David
Green, Richard E.
Akeson, Mark
Phillippy, Adam
Miga, Karen H.
Carnevali, Paolo
Jain, Miten
Paten, Benedict
collection PubMed
description De novo assembly of a human genome using nanopore long-read sequences has been reported, but it used more than 150,000 CPU hours and weeks of wall-clock time. To enable rapid human genome assembly, we present Shasta, a de novo long-read assembler, and polishing algorithms named MarginPolish and HELEN. Using a single PromethION nanopore sequencer and our toolkit, we assembled 11 highly contiguous human genomes de novo in 9 d. We achieved roughly 63× coverage, 42-kb read N50 values and 6.5× coverage in reads >100 kb using three flow cells per sample. Shasta produced a complete haploid human genome assembly in under 6 h on a single commercial compute node. MarginPolish and HELEN polished haploid assemblies to more than 99.9% identity (Phred quality score QV = 30) with nanopore reads alone. Addition of proximity-ligation sequencing enabled near chromosome-level scaffolds for all 11 genomes. We compare our assembly performance to existing methods for diploid, haploid and trio-binned human samples and report superior accuracy and speed.
format Online
Article
Text
id pubmed-7483855
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group US
record_format MEDLINE/PubMed
spelling pubmed-7483855 2020-11-04 Nat Biotechnol Article
Nature Publishing Group US 2020-05-04 2020 /pmc/articles/PMC7483855/ /pubmed/32686750 http://dx.doi.org/10.1038/s41587-020-0503-6 Text en
© The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7483855/
https://www.ncbi.nlm.nih.gov/pubmed/32686750
http://dx.doi.org/10.1038/s41587-020-0503-6