PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine

Bibliographic Details
Main Authors: Manavalan, Balachandran; Shin, Tae H.; Lee, Gwang
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects: Microbiology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5864850/
https://www.ncbi.nlm.nih.gov/pubmed/29616000
http://dx.doi.org/10.3389/fmicb.2018.00476
author Manavalan, Balachandran
Shin, Tae H.
Lee, Gwang
collection PubMed
description Accurately identifying bacteriophage virion proteins from uncharacterized sequences is important to understand interactions between the phage and its host bacteria in order to develop new antibacterial drugs. However, identification of such proteins using experimental techniques is expensive and often time consuming; hence, development of an efficient computational algorithm for the prediction of phage virion proteins (PVPs) prior to in vitro experimentation is needed. Here, we describe a support vector machine (SVM)-based PVP predictor, called PVP-SVM, which was trained with 136 optimal features. A feature selection protocol was employed to identify the optimal features from a large set that included amino acid composition, dipeptide composition, atomic composition, physicochemical properties, and chain-transition-distribution. PVP-SVM achieved an accuracy of 0.870 during leave-one-out cross-validation, which was 6% higher than control SVM predictors trained with all features, indicating the efficiency of the feature selection method. Furthermore, PVP-SVM displayed superior performance compared to the currently available method, PVPred, and two other machine-learning methods developed in this study when objectively evaluated with an independent dataset. For the convenience of the scientific community, a user-friendly and publicly accessible web server has been established at www.thegleelab.org/PVP-SVM/PVP-SVM.html.
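The description names amino acid composition and dipeptide composition among the candidate feature families. As a rough illustration only, and not the authors' implementation, the Python sketch below computes those two families for a single protein sequence. The function names, the `AMINO_ACIDS` constant, and the choice to skip non-standard residues are assumptions made for this example.

```python
from itertools import product

# Assumption: only the 20 standard residues are counted; anything else is skipped.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return the fraction of each standard residue in seq (20 values)."""
    seq = seq.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [c / total for c in counts]


def dipeptide_composition(seq):
    """Return the fraction of each adjacent residue pair in seq (400 values)."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing a non-standard residue
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def feature_vector(seq):
    """Concatenate both compositions into one 420-value vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)
```

Calling `feature_vector` on a sequence yields 420 values (20 + 400). The 136 optimal features mentioned in the description come from a selection step over a larger set that also includes atomic composition, physicochemical properties, and chain-transition-distribution, none of which is reproduced here.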
format Online
Article
Text
id pubmed-5864850
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-5864850 2018-04-03 Front Microbiol Microbiology Frontiers Media S.A. 2018-03-16 /pmc/articles/PMC5864850/ /pubmed/29616000 http://dx.doi.org/10.3389/fmicb.2018.00476 Text en
Copyright © 2018 Manavalan, Shin and Lee. http://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine
topic Microbiology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5864850/
https://www.ncbi.nlm.nih.gov/pubmed/29616000
http://dx.doi.org/10.3389/fmicb.2018.00476