VirHunter: A Deep Learning-Based Method for Detection of Novel RNA Viruses in Plant Sequencing Data

High-throughput sequencing has provided the capacity of broad virus detection for both known and unknown viruses in a variety of hosts and habitats. It has been successfully applied for novel virus discovery in many agricultural crops, leading to the current drive to apply this technology routinely...

Bibliographic Details
Main Authors: Sukhorukov, Grigorii; Khalili, Maryam; Gascuel, Olivier; Candresse, Thierry; Marais-Colombel, Armelle; Nikolski, Macha
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9580956/
https://www.ncbi.nlm.nih.gov/pubmed/36304258
http://dx.doi.org/10.3389/fbinf.2022.867111
author Sukhorukov, Grigorii
Khalili, Maryam
Gascuel, Olivier
Candresse, Thierry
Marais-Colombel, Armelle
Nikolski, Macha
collection PubMed
description High-throughput sequencing has provided the capacity of broad virus detection for both known and unknown viruses in a variety of hosts and habitats. It has been successfully applied for novel virus discovery in many agricultural crops, leading to the current drive to apply this technology routinely for plant health diagnostics. For this, efficient and precise methods for sequencing-based virus detection and discovery are essential. However, both existing alignment-based methods relying on reference databases and even more recent machine learning approaches are not efficient enough in detecting unknown viruses in RNAseq datasets of plant viromes. We present VirHunter, a deep learning convolutional neural network approach, to detect novel and known viruses in assemblies of sequencing datasets. While our method is generally applicable to a variety of viruses, here, we trained and evaluated it specifically for RNA viruses by reinforcing the coding sequences’ content in the training dataset. Trained on the NCBI plant viruses data for three different host species (peach, grapevine, and sugar beet), VirHunter outperformed the state-of-the-art method, DeepVirFinder, for the detection of novel viruses, both in the synthetic leave-out setting and on the 12 newly acquired RNAseq datasets. Compared with the traditional tBLASTx approach, VirHunter has consistently exhibited better results in the majority of leave-out experiments. In conclusion, we have shown that VirHunter can be used to streamline the analyses of plant HTS-acquired viromes and is particularly well suited for the detection of novel viral contigs, in RNAseq datasets.
format Online
Article
Text
id pubmed-9580956
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9580956 2022-10-26
Front Bioinform
Bioinformatics
Frontiers Media S.A. 2022-05-13
/pmc/articles/PMC9580956/ /pubmed/36304258 http://dx.doi.org/10.3389/fbinf.2022.867111
Text en
Copyright © 2022 Sukhorukov, Khalili, Gascuel, Candresse, Marais-Colombel and Nikolski.
https://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title VirHunter: A Deep Learning-Based Method for Detection of Novel RNA Viruses in Plant Sequencing Data
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9580956/
https://www.ncbi.nlm.nih.gov/pubmed/36304258
http://dx.doi.org/10.3389/fbinf.2022.867111