Completing bacterial genome assemblies: strategy and performance comparisons

Determining the genomic sequences of microorganisms is the basis and prerequisite for understanding their biology and functional characterization. While the advent of low-cost, extremely high-throughput second-generation sequencing technologies and the parallel development of assembly algorithms hav...

Bibliographic Details
Main Authors: Liao, Yu-Chieh, Lin, Shu-Hung, Lin, Hsin-Hung
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4348652/
https://www.ncbi.nlm.nih.gov/pubmed/25735824
http://dx.doi.org/10.1038/srep08747
collection PubMed
description Determining the genomic sequences of microorganisms is the basis and prerequisite for understanding their biology and functional characterization. While the advent of low-cost, extremely high-throughput second-generation sequencing technologies and the parallel development of assembly algorithms have generated rapid and cost-effective genome assemblies, such assemblies are often unfinished, fragmented draft genomes as a result of short read lengths and long repeats present in multiple copies. Third-generation, PacBio sequencing technologies circumvented this problem by greatly increasing read length. Hybrid approaches including ALLPATHS-LG, PacBio corrected reads pipeline, SPAdes, and SSPACE-LongRead, and non-hybrid approaches—hierarchical genome-assembly process (HGAP) and PacBio corrected reads pipeline via self-correction—have therefore been proposed to utilize the PacBio long reads that can span many thousands of bases to facilitate the assembly of complete microbial genomes. However, standardized procedures that aim at evaluating and comparing these approaches are currently insufficient. To address the issue, we herein provide a comprehensive comparison by collecting datasets for the comparative assessment on the above-mentioned five assemblers. In addition to offering explicit and beneficial recommendations to practitioners, this study aims to aid in the design of a paradigm positioned to complete bacterial genome assembly.
id pubmed-4348652
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-4348652 2015-03-10 Sci Rep Article Nature Publishing Group 2015-03-04 /pmc/articles/PMC4348652/ /pubmed/25735824 http://dx.doi.org/10.1038/srep08747 Text en
Copyright © 2015, Macmillan Publishers Limited. All rights reserved. This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third-party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder in order to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
topic Article