
RdRp-based sensitive taxonomic classification of RNA viruses for metagenomic data

Bibliographic Details
Main Authors: Tang, Xubo; Shang, Jiayu; Sun, Yanni
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Problem Solving Protocol
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8921650/
https://www.ncbi.nlm.nih.gov/pubmed/35136930
http://dx.doi.org/10.1093/bib/bbac011
author Tang, Xubo
Shang, Jiayu
Sun, Yanni
collection PubMed
description With advances in library construction protocols and next-generation sequencing technologies, viral metagenomic sequencing has become the major source for novel virus discovery. Conducting taxonomic classification for metagenomic data is an important means to characterize the viral composition in the underlying samples. However, RNA viruses are abundant and highly diverse, jeopardizing the sensitivity of comparison-based classification methods. To improve the sensitivity of read-level taxonomic classification, we developed an RNA-dependent RNA polymerase (RdRp) gene-based read classification tool RdRpBin. It combines alignment-based strategy with machine learning models in order to fully exploit the sequence properties of RdRp. We tested our method and compared its performance with the state-of-the-art tools on the simulated and real sequencing data. RdRpBin competes favorably with all. In particular, when the query RNA viruses share low sequence similarity with the known viruses ([Formula: see text]), our tool can still maintain a higher F-score than the state-of-the-art tools. The experimental results on real data also showed that RdRpBin can classify more RNA viral reads with a relatively low false-positive rate. Thus, RdRpBin can be utilized to classify novel and diverged RNA viruses.
format Online
Article
Text
id pubmed-8921650
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8921650 2022-03-15 RdRp-based sensitive taxonomic classification of RNA viruses for metagenomic data. Tang, Xubo; Shang, Jiayu; Sun, Yanni. Brief Bioinform. Problem Solving Protocol. Oxford University Press 2022-02-07 /pmc/articles/PMC8921650/ /pubmed/35136930 http://dx.doi.org/10.1093/bib/bbac011 Text en © The Author(s) 2022. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title RdRp-based sensitive taxonomic classification of RNA viruses for metagenomic data
topic Problem Solving Protocol
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8921650/
https://www.ncbi.nlm.nih.gov/pubmed/35136930
http://dx.doi.org/10.1093/bib/bbac011