Recovering individual haplotypes and a contiguous genome assembly from pooled long-read sequencing of the diamondback moth (Lepidoptera: Plutellidae)
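Main authors: | Whiteford, Samuel; van’t Hof, Arjen E; Krishna, Ritesh; Marubbi, Thea; Widdison, Stephanie; Saccheri, Ilik J; Guest, Marcus; Morrison, Neil I; Darby, Alistair C |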
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | Investigation |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9526047/ https://www.ncbi.nlm.nih.gov/pubmed/35980174 http://dx.doi.org/10.1093/g3journal/jkac210 |
_version_ | 1784800793454444544 |
---|---|
author | Whiteford, Samuel; van’t Hof, Arjen E; Krishna, Ritesh; Marubbi, Thea; Widdison, Stephanie; Saccheri, Ilik J; Guest, Marcus; Morrison, Neil I; Darby, Alistair C |
author_sort | Whiteford, Samuel |
collection | PubMed |
description | The assembly of divergent haplotypes using noisy long-read data presents a challenge to the reconstruction of haploid genome assemblies, due to overlapping distributions of technical sequencing error, intralocus genetic variation, and interlocus similarity within these data. Here, we present a comparative analysis of assembly algorithms representing overlap-layout-consensus, repeat graph, and de Bruijn graph methods. We examine how postprocessing strategies attempting to reduce redundant heterozygosity interact with the choice of initial assembly algorithm and ultimately produce a series of chromosome-level assemblies for an agricultural pest, the diamondback moth, Plutella xylostella (L.). We compare evaluation methods and show that BUSCO analyses may overestimate haplotig removal processing in long-read draft genomes, in comparison to a k-mer method. We discuss the trade-offs inherent in assembly algorithm and curation choices and suggest that “best practice” is research question dependent. We demonstrate a link between allelic divergence and allele-derived contig redundancy in final genome assemblies and document the patterns of coding and noncoding diversity between redundant sequences. We also document a link between an excess of nonsynonymous polymorphism and haplotigs that are unresolved by assembly or postassembly algorithms. Finally, we discuss how this phenomenon may have relevance for the usage of noisy long-read genome assemblies in comparative genomics. |
format | Online Article Text |
id | pubmed-9526047 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9526047 (2022-10-03). G3 (Bethesda), Investigation. Oxford University Press, 2022-08-18. /pmc/articles/PMC9526047/ /pubmed/35980174 http://dx.doi.org/10.1093/g3journal/jkac210 Text en. © The Author(s) 2022. Published by Oxford University Press on behalf of Genetics Society of America. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Recovering individual haplotypes and a contiguous genome assembly from pooled long-read sequencing of the diamondback moth (Lepidoptera: Plutellidae) |
topic | Investigation |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9526047/ https://www.ncbi.nlm.nih.gov/pubmed/35980174 http://dx.doi.org/10.1093/g3journal/jkac210 |