
Supervised Learning Classification Models for Prediction of Plant Virus Encoded RNA Silencing Suppressors


Bibliographic Details
Main authors: Jagga, Zeenia; Gupta, Dinesh
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4020838/
https://www.ncbi.nlm.nih.gov/pubmed/24828116
http://dx.doi.org/10.1371/journal.pone.0097446
author Jagga, Zeenia
Gupta, Dinesh
collection PubMed
description Viral encoded RNA silencing suppressor proteins interfere with the host RNA silencing machinery, facilitating viral infection by evading host immunity. In plant hosts, these viral proteins have several basic science implications and biotechnology applications. However, in silico identification of these proteins is limited by their high sequence diversity. In this study we developed supervised learning based classification models for RNA silencing suppressor proteins of plant viruses. We developed four classifiers based on the supervised learning algorithms J48, Random Forest, LibSVM and Naïve Bayes, with model learning enriched by correlation based feature selection. Structural and physicochemical features calculated for experimentally verified primary protein sequences were used to train the classifiers. The training features include amino acid composition; autocorrelation coefficients; composition, transition, and distribution of various physicochemical properties; and pseudo amino acid composition. Performance analysis of the predictive models, based on 10-fold cross-validation and independent data testing, revealed that the Random Forest based model was the best, achieving 86.11% overall accuracy and 86.22% balanced accuracy, with a remarkably high area under the Receiver Operating Characteristic curve of 0.95, in predicting viral RNA silencing suppressor proteins. The prediction models for plant viral RNA silencing suppressors can potentially aid identification of novel viral RNA silencing suppressors, which will provide valuable insights into the mechanism of RNA silencing and could be further explored as potential targets for designing novel antiviral therapeutics. Also, the key subset of identified optimal features may help in determining compositional patterns in the viral proteins that are important determinants of RNA silencing suppressor activity. The best prediction model developed in the study is available as a freely accessible web server, pVsupPred, at http://bioinfo.icgeb.res.in/pvsup/.
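Two quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequence, and the toy labels are hypothetical, and balanced accuracy is assumed to follow the common definition as the mean of sensitivity and specificity, which the record itself does not spell out.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every feature vector aligns.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = suppressor)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


if __name__ == "__main__":
    # Hypothetical toy inputs, not data from the study.
    features = amino_acid_composition("AACD")
    print(dict(zip(AMINO_ACIDS, features))["A"])
    print(balanced_accuracy([1, 1, 0, 0], [1, 0, 0, 0]))
```

A 20-element composition vector of this kind is one of several feature groups the abstract lists; the other groups (autocorrelation coefficients, composition/transition/distribution descriptors, pseudo amino acid composition) are not sketched here.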
format Online
Article
Text
id pubmed-4020838
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4020838 2014-05-21 PLoS One Research Article Public Library of Science 2014-05-14 /pmc/articles/PMC4020838/ /pubmed/24828116 http://dx.doi.org/10.1371/journal.pone.0097446 Text en © 2014 Jagga, Gupta http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Supervised Learning Classification Models for Prediction of Plant Virus Encoded RNA Silencing Suppressors
topic Research Article