
Single molecule sequencing-guided scaffolding and correction of draft assemblies

Bibliographic Details
Main authors: Zhu, Shenglong; Chen, Danny Z.; Emrich, Scott J.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects: Research
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5731603/
https://www.ncbi.nlm.nih.gov/pubmed/29244003
http://dx.doi.org/10.1186/s12864-017-4271-8
author Zhu, Shenglong
Chen, Danny Z.
Emrich, Scott J.
collection PubMed
description BACKGROUND: Although single molecule sequencing is still improving, the lengths of the generated sequences are inevitably an advantage in genome assembly. Prior work that utilizes long reads to conduct genome assembly has mostly focused on correcting sequencing errors and improving contiguity of de novo assemblies. RESULTS: We propose a disassembling-reassembling approach for both correcting structural errors in the draft assembly and scaffolding a target assembly based on error-corrected single molecule sequences. To achieve this goal, we formulate a maximum alternating path cover problem. We prove that this problem is NP-hard, and solve it with a 2-approximation algorithm. CONCLUSIONS: Our experimental results show that our approach can improve the structural correctness of target assemblies at the cost of some contiguity, even with smaller amounts of long reads. In addition, our reassembling process can also serve as a competitive scaffolder relative to well-established assembly benchmarks.
format Online
Article
Text
id pubmed-5731603
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5731603 2017-12-19
BMC Genomics
Research
BioMed Central 2017-12-06
/pmc/articles/PMC5731603/ /pubmed/29244003 http://dx.doi.org/10.1186/s12864-017-4271-8
Text en
© The Author(s) 2017. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Single molecule sequencing-guided scaffolding and correction of draft assemblies
topic Research