Comparison of De Novo Assembly Strategies for Bacterial Genomes

(1) Background: Short-read sequencing allows for the rapid and accurate analysis of the whole bacterial genome but does not usually enable complete genome assembly. Long-read sequencing greatly assists with the resolution of complex bacterial genomes, particularly when combined with short-read Illum...

Bibliographic Details
Main Authors: Zhang, Pengfei; Jiang, Dike; Wang, Yin; Yao, Xueping; Luo, Yan; Yang, Zexiao
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8306402/
https://www.ncbi.nlm.nih.gov/pubmed/34299288
http://dx.doi.org/10.3390/ijms22147668
author Zhang, Pengfei
Jiang, Dike
Wang, Yin
Yao, Xueping
Luo, Yan
Yang, Zexiao
collection PubMed
description (1) Background: Short-read sequencing allows for the rapid and accurate analysis of the whole bacterial genome but does not usually enable complete genome assembly. Long-read sequencing greatly assists with the resolution of complex bacterial genomes, particularly when combined with short-read Illumina data. However, it is not clear how different assembly strategies affect genomic accuracy, completeness, and protein prediction. (2) Methods: We compared different assembly strategies for Haemophilus parasuis, the cause of Glässer’s disease in swine (characterized by fibrinous polyserositis and arthritis), using Illumina sequencing and long reads from either the Oxford Nanopore Technologies (ONT) or the Pacific Biosciences (PacBio) SMRT platform. (3) Results: Assembly with either PacBio or ONT reads, followed by polishing with Illumina reads, facilitated high-quality genome reconstruction and was superior to the long-read-only assembly and hybrid-assembly strategies in terms of accuracy and completeness. An equally effective method was correction with Homopolish after the ONT-only assembly, which had the advantage of avoiding the need for additional Illumina sequencing. Furthermore, aligning transcripts to the assembled genomes and their predicted CDSs showed that the sequencing errors of the ONT assembly were mainly indels generated in homopolymer regions, which critically affected protein prediction. Polishing can fill indels and correct mistakes. (4) Conclusions: The assembly of bacterial genomes can be directly achieved by using long-read sequencing techniques. To maximize assembly accuracy, it is essential to polish the assembly with homologous sequences of related genomes or sequencing data from short-read technology.
format Online
Article
Text
id pubmed-8306402
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8306402 2021-07-25 Comparison of De Novo Assembly Strategies for Bacterial Genomes Zhang, Pengfei Jiang, Dike Wang, Yin Yao, Xueping Luo, Yan Yang, Zexiao Int J Mol Sci Brief Report
MDPI 2021-07-17 /pmc/articles/PMC8306402/ /pubmed/34299288 http://dx.doi.org/10.3390/ijms22147668 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Comparison of De Novo Assembly Strategies for Bacterial Genomes
topic Brief Report
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8306402/
https://www.ncbi.nlm.nih.gov/pubmed/34299288
http://dx.doi.org/10.3390/ijms22147668