
AccuVIR: an ACCUrate VIRal genome assembly tool for third-generation sequencing data

MOTIVATION: RNA viruses tend to mutate constantly. While many of the variants are neutral, some can lead to higher transmissibility or virulence. Accurate assembly of complete viral genomes enables the identification of underlying variants, which are essential for studying virus evolution and elucid...


Bibliographic Details
Main authors: Yu, Runzhou; Cai, Dehan; Sun, Yanni
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9825286/
https://www.ncbi.nlm.nih.gov/pubmed/36610711
http://dx.doi.org/10.1093/bioinformatics/btac827
author Yu, Runzhou
Cai, Dehan
Sun, Yanni
collection PubMed
description MOTIVATION: RNA viruses tend to mutate constantly. While many of the variants are neutral, some can lead to higher transmissibility or virulence. Accurate assembly of complete viral genomes enables the identification of underlying variants, which are essential for studying virus evolution and elucidating the relationship between genotypes and virus properties. Recently, third-generation sequencing platforms such as Nanopore sequencers have been used for real-time virus sequencing for Ebola, Zika, coronavirus disease 2019, etc. However, their high per-base error rate prevents the accurate reconstruction of the viral genome. RESULTS: In this work, we introduce a new tool, AccuVIR, for viral genome assembly and polishing using error-prone long reads. It can better distinguish sequencing errors from true variants based on the key observation that sequencing errors can disrupt the gene structures of viruses, which usually have a high density of coding regions. Our experimental results on both simulated and real third-generation sequencing data demonstrated its superior performance on generating more accurate viral genomes than generic assembly or polish tools. AVAILABILITY AND IMPLEMENTATION: The source code and the documentation of AccuVIR are available at https://github.com/rainyrubyzhou/AccuVIR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9825286
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9825286 2023-01-09 AccuVIR: an ACCUrate VIRal genome assembly tool for third-generation sequencing data Yu, Runzhou Cai, Dehan Sun, Yanni Bioinformatics Original Paper MOTIVATION: RNA viruses tend to mutate constantly. While many of the variants are neutral, some can lead to higher transmissibility or virulence. Accurate assembly of complete viral genomes enables the identification of underlying variants, which are essential for studying virus evolution and elucidating the relationship between genotypes and virus properties. Recently, third-generation sequencing platforms such as Nanopore sequencers have been used for real-time virus sequencing for Ebola, Zika, coronavirus disease 2019, etc. However, their high per-base error rate prevents the accurate reconstruction of the viral genome. RESULTS: In this work, we introduce a new tool, AccuVIR, for viral genome assembly and polishing using error-prone long reads. It can better distinguish sequencing errors from true variants based on the key observation that sequencing errors can disrupt the gene structures of viruses, which usually have a high density of coding regions. Our experimental results on both simulated and real third-generation sequencing data demonstrated its superior performance on generating more accurate viral genomes than generic assembly or polish tools. AVAILABILITY AND IMPLEMENTATION: The source code and the documentation of AccuVIR are available at https://github.com/rainyrubyzhou/AccuVIR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2022-12-26 /pmc/articles/PMC9825286/ /pubmed/36610711 http://dx.doi.org/10.1093/bioinformatics/btac827 Text en © The Author(s) 2022. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title AccuVIR: an ACCUrate VIRal genome assembly tool for third-generation sequencing data
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9825286/
https://www.ncbi.nlm.nih.gov/pubmed/36610711
http://dx.doi.org/10.1093/bioinformatics/btac827