Effect of sequence depth and length in long-read assembly of the maize inbred NC358

Bibliographic Details
Main authors: Ou, Shujun, Liu, Jianing, Chougule, Kapeel M., Fungtammasan, Arkarachai, Seetharam, Arun S., Stein, Joshua C., Llaca, Victor, Manchanda, Nancy, Gilbert, Amanda M., Wei, Sharon, Chin, Chen-Shan, Hufnagel, David E., Pedersen, Sarah, Snodgrass, Samantha J., Fengler, Kevin, Woodhouse, Margaret, Walenz, Brian P., Koren, Sergey, Phillippy, Adam M., Hannigan, Brett T., Dawe, R. Kelly, Hirsch, Candice N., Hufford, Matthew B., Ware, Doreen
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7211024/
https://www.ncbi.nlm.nih.gov/pubmed/32385271
http://dx.doi.org/10.1038/s41467-020-16037-7
author Ou, Shujun
Liu, Jianing
Chougule, Kapeel M.
Fungtammasan, Arkarachai
Seetharam, Arun S.
Stein, Joshua C.
Llaca, Victor
Manchanda, Nancy
Gilbert, Amanda M.
Wei, Sharon
Chin, Chen-Shan
Hufnagel, David E.
Pedersen, Sarah
Snodgrass, Samantha J.
Fengler, Kevin
Woodhouse, Margaret
Walenz, Brian P.
Koren, Sergey
Phillippy, Adam M.
Hannigan, Brett T.
Dawe, R. Kelly
Hirsch, Candice N.
Hufford, Matthew B.
Ware, Doreen
author_sort Ou, Shujun
collection PubMed
description Improvements in long-read data and scaffolding technologies have enabled rapid generation of reference-quality assemblies for complex genomes. Still, an assessment of critical sequence depth and read length is important for allocating limited resources. To this end, we have generated eight assemblies for the complex genome of the maize inbred line NC358 using PacBio datasets ranging from 20× to 75× genomic depth and with N50 subread lengths of 11–21 kb. Assemblies with ≤30× depth and N50 subread length of 11 kb are highly fragmented, with even low-copy genic regions showing degradation at 20× depth. Distinct sequence-quality thresholds are observed for complete assembly of genes, transposable elements, and highly repetitive genomic features such as telomeres, heterochromatic knobs, and centromeres. In addition, we show high-quality optical maps can dramatically improve contiguity in even our most fragmented base assembly. This study provides a useful resource allocation reference to the community as long-read technologies continue to mature.
format Online
Article
Text
id pubmed-7211024
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7211024 2020-05-13 Effect of sequence depth and length in long-read assembly of the maize inbred NC358 Nat Commun Article Nature Publishing Group UK 2020-05-08 /pmc/articles/PMC7211024/ /pubmed/32385271 http://dx.doi.org/10.1038/s41467-020-16037-7 Text en © This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Effect of sequence depth and length in long-read assembly of the maize inbred NC358
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7211024/
https://www.ncbi.nlm.nih.gov/pubmed/32385271
http://dx.doi.org/10.1038/s41467-020-16037-7