NeoRdRp: A Comprehensive Dataset for Identifying RNA-dependent RNA Polymerases of Various RNA Viruses from Metatranscriptomic Data
RNA viruses are distributed throughout various environments, and most have recently been identified by metatranscriptome sequencing. However, due to the high nucleotide diversity of RNA viruses, it is still challenging to identify novel RNA viruses from metatranscriptome data. To overcome this issue...
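Main Authors: | Sakaguchi, Shoichi, Urayama, Syun-ichi, Takaki, Yoshihiro, Hirosuna, Kensuke, Wu, Hong, Suzuki, Youichi, Nunoura, Takuro, Nakano, Takashi, Nakagawa, So |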
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Japanese Society of Microbial Ecology / Japanese Society of Soil Microbiology / Taiwan Society of Microbial Ecology / Japanese Society of Plant Microbe Interactions / Japanese Society for Extremophiles
2022
|
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9530720/ https://www.ncbi.nlm.nih.gov/pubmed/36002304 http://dx.doi.org/10.1264/jsme2.ME22001 |
_version_ | 1784801747466715136 |
---|---|
author | Sakaguchi, Shoichi Urayama, Syun-ichi Takaki, Yoshihiro Hirosuna, Kensuke Wu, Hong Suzuki, Youichi Nunoura, Takuro Nakano, Takashi Nakagawa, So |
author_facet | Sakaguchi, Shoichi Urayama, Syun-ichi Takaki, Yoshihiro Hirosuna, Kensuke Wu, Hong Suzuki, Youichi Nunoura, Takuro Nakano, Takashi Nakagawa, So |
author_sort | Sakaguchi, Shoichi |
collection | PubMed |
description | RNA viruses are distributed throughout various environments, and most have recently been identified by metatranscriptome sequencing. However, due to the high nucleotide diversity of RNA viruses, it is still challenging to identify novel RNA viruses from metatranscriptome data. To overcome this issue, we created a dataset of RNA-dependent RNA polymerase (RdRp) domains that are essential for all RNA viruses belonging to Orthornavirae. Genes with RdRp domains from various RNA viruses were clustered based on amino acid sequence similarities. A multiple sequence alignment was generated for each cluster, and a hidden Markov model (HMM) profile was created when the number of sequences was greater than three. We further refined 426 HMM profiles by detecting RefSeq RNA virus sequences and subsequently combined the hit sequences with the RdRp domains. As a result, 1,182 HMM profiles were generated from 12,502 RdRp domain sequences, and the dataset was named NeoRdRp. The majority of NeoRdRp HMM profiles successfully detected RdRp domains, specifically in the UniProt dataset. Furthermore, we compared the NeoRdRp dataset with two previously reported methods for RNA virus detection using metatranscriptome sequencing data. Our methods successfully identified the majority of RNA viruses in the datasets; however, some RNA viruses were not detected, similar to the other two methods. NeoRdRp may be repeatedly improved by the addition of new RdRp sequences and is applicable as a system for detecting various RNA viruses from diverse metatranscriptome data. |
format | Online Article Text |
id | pubmed-9530720 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Japanese Society of Microbial Ecology / Japanese Society of Soil Microbiology / Taiwan Society of Microbial Ecology / Japanese Society of Plant Microbe Interactions / Japanese Society for Extremophiles |
record_format | MEDLINE/PubMed |
spelling | pubmed-95307202022-10-12 NeoRdRp: A Comprehensive Dataset for Identifying RNA-dependent RNA Polymerases of Various RNA Viruses from Metatranscriptomic Data Sakaguchi, Shoichi Urayama, Syun-ichi Takaki, Yoshihiro Hirosuna, Kensuke Wu, Hong Suzuki, Youichi Nunoura, Takuro Nakano, Takashi Nakagawa, So Microbes Environ Regular Paper RNA viruses are distributed throughout various environments, and most have recently been identified by metatranscriptome sequencing. However, due to the high nucleotide diversity of RNA viruses, it is still challenging to identify novel RNA viruses from metatranscriptome data. To overcome this issue, we created a dataset of RNA-dependent RNA polymerase (RdRp) domains that are essential for all RNA viruses belonging to Orthornavirae. Genes with RdRp domains from various RNA viruses were clustered based on amino acid sequence similarities. A multiple sequence alignment was generated for each cluster, and a hidden Markov model (HMM) profile was created when the number of sequences was greater than three. We further refined 426 HMM profiles by detecting RefSeq RNA virus sequences and subsequently combined the hit sequences with the RdRp domains. As a result, 1,182 HMM profiles were generated from 12,502 RdRp domain sequences, and the dataset was named NeoRdRp. The majority of NeoRdRp HMM profiles successfully detected RdRp domains, specifically in the UniProt dataset. Furthermore, we compared the NeoRdRp dataset with two previously reported methods for RNA virus detection using metatranscriptome sequencing data. Our methods successfully identified the majority of RNA viruses in the datasets; however, some RNA viruses were not detected, similar to the other two methods. NeoRdRp may be repeatedly improved by the addition of new RdRp sequences and is applicable as a system for detecting various RNA viruses from diverse metatranscriptome data. 
Japanese Society of Microbial Ecology / Japanese Society of Soil Microbiology / Taiwan Society of Microbial Ecology / Japanese Society of Plant Microbe Interactions / Japanese Society for Extremophiles 2022 2022-08-24 /pmc/articles/PMC9530720/ /pubmed/36002304 http://dx.doi.org/10.1264/jsme2.ME22001 Text en 2022 by Japanese Society of Microbial Ecology / Japanese Society of Soil Microbiology / Taiwan Society of Microbial Ecology / Japanese Society of Plant Microbe Interactions / Japanese Society for Extremophiles. https://creativecommons.org/licenses/by/3.0/This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Regular Paper Sakaguchi, Shoichi Urayama, Syun-ichi Takaki, Yoshihiro Hirosuna, Kensuke Wu, Hong Suzuki, Youichi Nunoura, Takuro Nakano, Takashi Nakagawa, So NeoRdRp: A Comprehensive Dataset for Identifying RNA-dependent RNA Polymerases of Various RNA Viruses from Metatranscriptomic Data |
title | NeoRdRp: A Comprehensive Dataset for Identifying RNA-dependent RNA Polymerases of Various RNA Viruses from Metatranscriptomic Data |
title_sort | neordrp: a comprehensive dataset for identifying rna-dependent rna polymerases of various rna viruses from metatranscriptomic data |
topic | Regular Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9530720/ https://www.ncbi.nlm.nih.gov/pubmed/36002304 http://dx.doi.org/10.1264/jsme2.ME22001 |
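
The description field in this record states that an HMM profile was created for a cluster only when the number of sequences was greater than three. The selection step alone can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the `(sequence_id, cluster_id)` input format, and the toy data are all assumptions, and the clustering, alignment, and profile-building steps themselves are not shown.

```python
from collections import defaultdict

# From the record's description: a profile is built only when a cluster
# holds MORE than three sequences (strictly greater than).
MIN_SEQS_EXCLUSIVE = 3


def group_by_cluster(assignments):
    """Group sequence IDs by cluster ID.

    `assignments` is an iterable of (sequence_id, cluster_id) pairs;
    this input format is hypothetical, chosen for illustration.
    """
    clusters = defaultdict(list)
    for seq_id, cluster_id in assignments:
        clusters[cluster_id].append(seq_id)
    return dict(clusters)


def clusters_eligible_for_hmm(clusters, threshold=MIN_SEQS_EXCLUSIVE):
    """Keep only clusters with more than `threshold` member sequences."""
    return {
        cluster_id: members
        for cluster_id, members in clusters.items()
        if len(members) > threshold
    }


# Toy data (invented): c1 has 4 members, c2 has 3, c3 has 1.
toy_assignments = [
    ("s1", "c1"), ("s2", "c1"), ("s3", "c1"), ("s4", "c1"),
    ("s5", "c2"), ("s6", "c2"), ("s7", "c2"),
    ("s8", "c3"),
]

clusters = group_by_cluster(toy_assignments)
eligible = clusters_eligible_for_hmm(clusters)
print(sorted(eligible))  # ['c1']
```

Only `c1` passes, because a cluster of exactly three sequences does not satisfy "greater than three".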