Reference-based read clustering improves the de novo genome assembly of microbial strains
Constructing accurate microbial genome assemblies is necessary to understand genetic diversity in microbial genomes and its functional consequences. However, it remains a challenging task, especially when only short-read sequencing technologies are used. Here, we present a new read-clusterin...
Main authors: | Sim, Mikang; Lee, Jongin; Kwon, Daehong; Lee, Daehwan; Park, Nayoung; Wy, Suyeon; Ko, Younhee; Kim, Jaebum |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Research Network of Computational and Structural Biotechnology 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9804104/ https://www.ncbi.nlm.nih.gov/pubmed/36618978 http://dx.doi.org/10.1016/j.csbj.2022.12.032 |
_version_ | 1784862030104100864 |
---|---|
author | Sim, Mikang Lee, Jongin Kwon, Daehong Lee, Daehwan Park, Nayoung Wy, Suyeon Ko, Younhee Kim, Jaebum |
author_facet | Sim, Mikang Lee, Jongin Kwon, Daehong Lee, Daehwan Park, Nayoung Wy, Suyeon Ko, Younhee Kim, Jaebum |
author_sort | Sim, Mikang |
collection | PubMed |
description | Constructing accurate microbial genome assemblies is necessary to understand genetic diversity in microbial genomes and its functional consequences. However, it remains a challenging task, especially when only short-read sequencing technologies are used. Here, we present a new read-clustering algorithm, called RBRC, for improving de novo microbial genome assembly by accurately estimating read proximity using multiple reference genomes. The performance of RBRC was confirmed by simulation-based evaluation in terms of assembly contiguity and the number of misassemblies, and it was successfully applied to existing fungal and bacterial genomes, improving the quality of the assemblies without using additional sequencing data. RBRC is a very useful read-clustering algorithm that can be used (i) for generating high-quality genome assemblies of microbial strains when genome assemblies of related strains are available, and (ii) for upgrading existing microbial genome assemblies when the generation of additional sequencing data, such as long reads, is difficult. |
format | Online Article Text |
id | pubmed-9804104 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Research Network of Computational and Structural Biotechnology |
record_format | MEDLINE/PubMed |
spelling | pubmed-9804104 2023-01-05 Reference-based read clustering improves the de novo genome assembly of microbial strains Sim, Mikang Lee, Jongin Kwon, Daehong Lee, Daehwan Park, Nayoung Wy, Suyeon Ko, Younhee Kim, Jaebum Comput Struct Biotechnol J Research Article Constructing accurate microbial genome assemblies is necessary to understand genetic diversity in microbial genomes and its functional consequences. However, it remains a challenging task, especially when only short-read sequencing technologies are used. Here, we present a new read-clustering algorithm, called RBRC, for improving de novo microbial genome assembly by accurately estimating read proximity using multiple reference genomes. The performance of RBRC was confirmed by simulation-based evaluation in terms of assembly contiguity and the number of misassemblies, and it was successfully applied to existing fungal and bacterial genomes, improving the quality of the assemblies without using additional sequencing data. RBRC is a very useful read-clustering algorithm that can be used (i) for generating high-quality genome assemblies of microbial strains when genome assemblies of related strains are available, and (ii) for upgrading existing microbial genome assemblies when the generation of additional sequencing data, such as long reads, is difficult. Research Network of Computational and Structural Biotechnology 2022-12-21 /pmc/articles/PMC9804104/ /pubmed/36618978 http://dx.doi.org/10.1016/j.csbj.2022.12.032 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Research Article Sim, Mikang Lee, Jongin Kwon, Daehong Lee, Daehwan Park, Nayoung Wy, Suyeon Ko, Younhee Kim, Jaebum Reference-based read clustering improves the de novo genome assembly of microbial strains |
title | Reference-based read clustering improves the de novo genome assembly of microbial strains |
title_full | Reference-based read clustering improves the de novo genome assembly of microbial strains |
title_fullStr | Reference-based read clustering improves the de novo genome assembly of microbial strains |
title_full_unstemmed | Reference-based read clustering improves the de novo genome assembly of microbial strains |
title_short | Reference-based read clustering improves the de novo genome assembly of microbial strains |
title_sort | reference-based read clustering improves the de novo genome assembly of microbial strains |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9804104/ https://www.ncbi.nlm.nih.gov/pubmed/36618978 http://dx.doi.org/10.1016/j.csbj.2022.12.032 |