
NeoRdRp: A Comprehensive Dataset for Identifying RNA-dependent RNA Polymerases of Various RNA Viruses from Metatranscriptomic Data

RNA viruses are distributed throughout various environments, and most have recently been identified by metatranscriptome sequencing. However, due to the high nucleotide diversity of RNA viruses, it is still challenging to identify novel RNA viruses from metatranscriptome data. To overcome this issue...


Bibliographic Details
Main Authors: Sakaguchi, Shoichi; Urayama, Syun-ichi; Takaki, Yoshihiro; Hirosuna, Kensuke; Wu, Hong; Suzuki, Youichi; Nunoura, Takuro; Nakano, Takashi; Nakagawa, So
Format: Online Article Text
Language: English
Published: Japanese Society of Microbial Ecology / Japanese Society of Soil Microbiology / Taiwan Society of Microbial Ecology / Japanese Society of Plant Microbe Interactions / Japanese Society for Extremophiles 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9530720/
https://www.ncbi.nlm.nih.gov/pubmed/36002304
http://dx.doi.org/10.1264/jsme2.ME22001
_version_ 1784801747466715136
author Sakaguchi, Shoichi
Urayama, Syun-ichi
Takaki, Yoshihiro
Hirosuna, Kensuke
Wu, Hong
Suzuki, Youichi
Nunoura, Takuro
Nakano, Takashi
Nakagawa, So
author_sort Sakaguchi, Shoichi
collection PubMed
description RNA viruses are distributed throughout various environments, and most have recently been identified by metatranscriptome sequencing. However, due to the high nucleotide diversity of RNA viruses, it is still challenging to identify novel RNA viruses from metatranscriptome data. To overcome this issue, we created a dataset of RNA-dependent RNA polymerase (RdRp) domains that are essential for all RNA viruses belonging to Orthornavirae. Genes with RdRp domains from various RNA viruses were clustered based on amino acid sequence similarities. A multiple sequence alignment was generated for each cluster, and a hidden Markov model (HMM) profile was created when the number of sequences was greater than three. We further refined 426 HMM profiles by detecting RefSeq RNA virus sequences and subsequently combined the hit sequences with the RdRp domains. As a result, 1,182 HMM profiles were generated from 12,502 RdRp domain sequences, and the dataset was named NeoRdRp. The majority of NeoRdRp HMM profiles successfully detected RdRp domains, specifically in the UniProt dataset. Furthermore, we compared the NeoRdRp dataset with two previously reported methods for RNA virus detection using metatranscriptome sequencing data. Our methods successfully identified the majority of RNA viruses in the datasets; however, some RNA viruses were not detected, similar to the other two methods. NeoRdRp may be repeatedly improved by the addition of new RdRp sequences and is applicable as a system for detecting various RNA viruses from diverse metatranscriptome data.
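The profile-building rule in the description (cluster the RdRp domain sequences, then build an HMM profile only for clusters with more than three sequences) can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function names, the `(sequence_id, cluster_id)` input layout, and the example identifiers are all hypothetical, and the clustering, alignment, and HMM-building tools themselves are not named in this record, so they are left out.

```python
from collections import defaultdict

# The description says a profile was created when a cluster held
# "greater than three" sequences, i.e. at least four.
MIN_CLUSTER_SIZE = 4


def group_by_cluster(assignments):
    """Group sequence IDs by cluster ID.

    `assignments` is an iterable of (sequence_id, cluster_id) pairs;
    this layout is an assumption made for illustration only.
    """
    clusters = defaultdict(list)
    for seq_id, cluster_id in assignments:
        clusters[cluster_id].append(seq_id)
    return dict(clusters)


def clusters_for_hmm(clusters, min_size=MIN_CLUSTER_SIZE):
    """Keep only clusters large enough to be aligned and turned into a profile."""
    return {cid: ids for cid, ids in clusters.items() if len(ids) >= min_size}


if __name__ == "__main__":
    # Hypothetical sequence and cluster identifiers.
    example = [
        ("seq1", "a"), ("seq2", "a"), ("seq3", "a"), ("seq4", "a"),
        ("seq5", "b"), ("seq6", "b"), ("seq7", "b"),
    ]
    eligible = clusters_for_hmm(group_by_cluster(example))
    print(sorted(eligible))
```

In this sketch, cluster `a` (four sequences) would go on to alignment and profile building, while cluster `b` (three sequences) would be skipped.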
format Online
Article
Text
id pubmed-9530720
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Japanese Society of Microbial Ecology / Japanese Society of Soil Microbiology / Taiwan Society of Microbial Ecology / Japanese Society of Plant Microbe Interactions / Japanese Society for Extremophiles
record_format MEDLINE/PubMed
spelling pubmed-9530720 2022-10-12 NeoRdRp: A Comprehensive Dataset for Identifying RNA-dependent RNA Polymerases of Various RNA Viruses from Metatranscriptomic Data Sakaguchi, Shoichi Urayama, Syun-ichi Takaki, Yoshihiro Hirosuna, Kensuke Wu, Hong Suzuki, Youichi Nunoura, Takuro Nakano, Takashi Nakagawa, So Microbes Environ Regular Paper RNA viruses are distributed throughout various environments, and most have recently been identified by metatranscriptome sequencing. However, due to the high nucleotide diversity of RNA viruses, it is still challenging to identify novel RNA viruses from metatranscriptome data. To overcome this issue, we created a dataset of RNA-dependent RNA polymerase (RdRp) domains that are essential for all RNA viruses belonging to Orthornavirae. Genes with RdRp domains from various RNA viruses were clustered based on amino acid sequence similarities. A multiple sequence alignment was generated for each cluster, and a hidden Markov model (HMM) profile was created when the number of sequences was greater than three. We further refined 426 HMM profiles by detecting RefSeq RNA virus sequences and subsequently combined the hit sequences with the RdRp domains. As a result, 1,182 HMM profiles were generated from 12,502 RdRp domain sequences, and the dataset was named NeoRdRp. The majority of NeoRdRp HMM profiles successfully detected RdRp domains, specifically in the UniProt dataset. Furthermore, we compared the NeoRdRp dataset with two previously reported methods for RNA virus detection using metatranscriptome sequencing data. Our methods successfully identified the majority of RNA viruses in the datasets; however, some RNA viruses were not detected, similar to the other two methods. NeoRdRp may be repeatedly improved by the addition of new RdRp sequences and is applicable as a system for detecting various RNA viruses from diverse metatranscriptome data.
Japanese Society of Microbial Ecology / Japanese Society of Soil Microbiology / Taiwan Society of Microbial Ecology / Japanese Society of Plant Microbe Interactions / Japanese Society for Extremophiles 2022 2022-08-24 /pmc/articles/PMC9530720/ /pubmed/36002304 http://dx.doi.org/10.1264/jsme2.ME22001 Text en 2022 by Japanese Society of Microbial Ecology / Japanese Society of Soil Microbiology / Taiwan Society of Microbial Ecology / Japanese Society of Plant Microbe Interactions / Japanese Society for Extremophiles. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Regular Paper
Sakaguchi, Shoichi
Urayama, Syun-ichi
Takaki, Yoshihiro
Hirosuna, Kensuke
Wu, Hong
Suzuki, Youichi
Nunoura, Takuro
Nakano, Takashi
Nakagawa, So
NeoRdRp: A Comprehensive Dataset for Identifying RNA-dependent RNA Polymerases of Various RNA Viruses from Metatranscriptomic Data
title NeoRdRp: A Comprehensive Dataset for Identifying RNA-dependent RNA Polymerases of Various RNA Viruses from Metatranscriptomic Data
topic Regular Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9530720/
https://www.ncbi.nlm.nih.gov/pubmed/36002304
http://dx.doi.org/10.1264/jsme2.ME22001