AccuVIR: an ACCUrate VIRal genome assembly tool for third-generation sequencing data
MOTIVATION: RNA viruses tend to mutate constantly. While many of the variants are neutral, some can lead to higher transmissibility or virulence. Accurate assembly of complete viral genomes enables the identification of underlying variants, which are essential for studying virus evolution and elucid...
Main Authors: | Yu, Runzhou, Cai, Dehan, Sun, Yanni |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9825286/ https://www.ncbi.nlm.nih.gov/pubmed/36610711 http://dx.doi.org/10.1093/bioinformatics/btac827 |
_version_ | 1784866605595885568 |
---|---|
author | Yu, Runzhou Cai, Dehan Sun, Yanni |
author_facet | Yu, Runzhou Cai, Dehan Sun, Yanni |
author_sort | Yu, Runzhou |
collection | PubMed |
description | MOTIVATION: RNA viruses tend to mutate constantly. While many of the variants are neutral, some can lead to higher transmissibility or virulence. Accurate assembly of complete viral genomes enables the identification of underlying variants, which are essential for studying virus evolution and elucidating the relationship between genotypes and virus properties. Recently, third-generation sequencing platforms such as Nanopore sequencers have been used for real-time virus sequencing for Ebola, Zika, coronavirus disease 2019, etc. However, their high per-base error rate prevents the accurate reconstruction of the viral genome. RESULTS: In this work, we introduce a new tool, AccuVIR, for viral genome assembly and polishing using error-prone long reads. It can better distinguish sequencing errors from true variants based on the key observation that sequencing errors can disrupt the gene structures of viruses, which usually have a high density of coding regions. Our experimental results on both simulated and real third-generation sequencing data demonstrated its superior performance on generating more accurate viral genomes than generic assembly or polish tools. AVAILABILITY AND IMPLEMENTATION: The source code and the documentation of AccuVIR are available at https://github.com/rainyrubyzhou/AccuVIR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-9825286 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-98252862023-01-09 AccuVIR: an ACCUrate VIRal genome assembly tool for third-generation sequencing data Yu, Runzhou Cai, Dehan Sun, Yanni Bioinformatics Original Paper MOTIVATION: RNA viruses tend to mutate constantly. While many of the variants are neutral, some can lead to higher transmissibility or virulence. Accurate assembly of complete viral genomes enables the identification of underlying variants, which are essential for studying virus evolution and elucidating the relationship between genotypes and virus properties. Recently, third-generation sequencing platforms such as Nanopore sequencers have been used for real-time virus sequencing for Ebola, Zika, coronavirus disease 2019, etc. However, their high per-base error rate prevents the accurate reconstruction of the viral genome. RESULTS: In this work, we introduce a new tool, AccuVIR, for viral genome assembly and polishing using error-prone long reads. It can better distinguish sequencing errors from true variants based on the key observation that sequencing errors can disrupt the gene structures of viruses, which usually have a high density of coding regions. Our experimental results on both simulated and real third-generation sequencing data demonstrated its superior performance on generating more accurate viral genomes than generic assembly or polish tools. AVAILABILITY AND IMPLEMENTATION: The source code and the documentation of AccuVIR are available at https://github.com/rainyrubyzhou/AccuVIR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2022-12-26 /pmc/articles/PMC9825286/ /pubmed/36610711 http://dx.doi.org/10.1093/bioinformatics/btac827 Text en © The Author(s) 2022. Published by Oxford University Press. 
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Paper Yu, Runzhou Cai, Dehan Sun, Yanni AccuVIR: an ACCUrate VIRal genome assembly tool for third-generation sequencing data |
title | AccuVIR: an ACCUrate VIRal genome assembly tool for third-generation sequencing data |
title_full | AccuVIR: an ACCUrate VIRal genome assembly tool for third-generation sequencing data |
title_fullStr | AccuVIR: an ACCUrate VIRal genome assembly tool for third-generation sequencing data |
title_full_unstemmed | AccuVIR: an ACCUrate VIRal genome assembly tool for third-generation sequencing data |
title_short | AccuVIR: an ACCUrate VIRal genome assembly tool for third-generation sequencing data |
title_sort | accuvir: an accurate viral genome assembly tool for third-generation sequencing data |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9825286/ https://www.ncbi.nlm.nih.gov/pubmed/36610711 http://dx.doi.org/10.1093/bioinformatics/btac827 |
work_keys_str_mv | AT yurunzhou accuviranaccurateviralgenomeassemblytoolforthirdgenerationsequencingdata AT caidehan accuviranaccurateviralgenomeassemblytoolforthirdgenerationsequencingdata AT sunyanni accuviranaccurateviralgenomeassemblytoolforthirdgenerationsequencingdata |