PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine
Accurately identifying bacteriophage virion proteins from uncharacterized sequences is important to understand interactions between the phage and its host bacteria in order to develop new antibacterial drugs. However, identification of such proteins using experimental techniques is expensive and oft...
Main Authors: | Manavalan, Balachandran, Shin, Tae H., Lee, Gwang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2018 |
Subjects: | Microbiology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5864850/ https://www.ncbi.nlm.nih.gov/pubmed/29616000 http://dx.doi.org/10.3389/fmicb.2018.00476 |
_version_ | 1783308569839403008 |
---|---|
author | Manavalan, Balachandran Shin, Tae H. Lee, Gwang |
author_facet | Manavalan, Balachandran Shin, Tae H. Lee, Gwang |
author_sort | Manavalan, Balachandran |
collection | PubMed |
description | Accurately identifying bacteriophage virion proteins from uncharacterized sequences is important to understand interactions between the phage and its host bacteria in order to develop new antibacterial drugs. However, identification of such proteins using experimental techniques is expensive and often time consuming; hence, development of an efficient computational algorithm for the prediction of phage virion proteins (PVPs) prior to in vitro experimentation is needed. Here, we describe a support vector machine (SVM)-based PVP predictor, called PVP-SVM, which was trained with 136 optimal features. A feature selection protocol was employed to identify the optimal features from a large set that included amino acid composition, dipeptide composition, atomic composition, physicochemical properties, and chain-transition-distribution. PVP-SVM achieved an accuracy of 0.870 during leave-one-out cross-validation, which was 6% higher than control SVM predictors trained with all features, indicating the efficiency of the feature selection method. Furthermore, PVP-SVM displayed superior performance compared to the currently available method, PVPred, and two other machine-learning methods developed in this study when objectively evaluated with an independent dataset. For the convenience of the scientific community, a user-friendly and publicly accessible web server has been established at www.thegleelab.org/PVP-SVM/PVP-SVM.html. |
format | Online Article Text |
id | pubmed-5864850 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-58648502018-04-03 PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine Manavalan, Balachandran Shin, Tae H. Lee, Gwang Front Microbiol Microbiology Accurately identifying bacteriophage virion proteins from uncharacterized sequences is important to understand interactions between the phage and its host bacteria in order to develop new antibacterial drugs. However, identification of such proteins using experimental techniques is expensive and often time consuming; hence, development of an efficient computational algorithm for the prediction of phage virion proteins (PVPs) prior to in vitro experimentation is needed. Here, we describe a support vector machine (SVM)-based PVP predictor, called PVP-SVM, which was trained with 136 optimal features. A feature selection protocol was employed to identify the optimal features from a large set that included amino acid composition, dipeptide composition, atomic composition, physicochemical properties, and chain-transition-distribution. PVP-SVM achieved an accuracy of 0.870 during leave-one-out cross-validation, which was 6% higher than control SVM predictors trained with all features, indicating the efficiency of the feature selection method. Furthermore, PVP-SVM displayed superior performance compared to the currently available method, PVPred, and two other machine-learning methods developed in this study when objectively evaluated with an independent dataset. For the convenience of the scientific community, a user-friendly and publicly accessible web server has been established at www.thegleelab.org/PVP-SVM/PVP-SVM.html. Frontiers Media S.A. 2018-03-16 /pmc/articles/PMC5864850/ /pubmed/29616000 http://dx.doi.org/10.3389/fmicb.2018.00476 Text en Copyright © 2018 Manavalan, Shin and Lee. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). 
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Microbiology Manavalan, Balachandran Shin, Tae H. Lee, Gwang PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine |
title | PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine |
title_full | PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine |
title_fullStr | PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine |
title_full_unstemmed | PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine |
title_short | PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine |
title_sort | pvp-svm: sequence-based prediction of phage virion proteins using a support vector machine |
topic | Microbiology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5864850/ https://www.ncbi.nlm.nih.gov/pubmed/29616000 http://dx.doi.org/10.3389/fmicb.2018.00476 |
work_keys_str_mv | AT manavalanbalachandran pvpsvmsequencebasedpredictionofphagevirionproteinsusingasupportvectormachine AT shintaeh pvpsvmsequencebasedpredictionofphagevirionproteinsusingasupportvectormachine AT leegwang pvpsvmsequencebasedpredictionofphagevirionproteinsusingasupportvectormachine |
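
The description field in this record names amino acid composition and dipeptide composition among the sequence features considered for PVP-SVM. As a rough illustration of what those two encodings usually look like, the following is a minimal sketch using the common textbook definitions (fractions of the 20 standard residues, and fractions of the 400 ordered residue pairs). The record does not give the authors' exact formulas, so the function names, the handling of non-standard residues, and the example sequence are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return 20 fractions, one per standard residue (hypothetical helper).

    Non-standard characters are ignored, which is an assumption of this sketch.
    """
    seq = seq.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 fractions, one per ordered pair of adjacent standard residues."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Keep only pairs made of two standard residues (assumption of this sketch).
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(valid)
    total = len(valid)
    if total == 0:
        return [0.0] * (len(AMINO_ACIDS) ** 2)
    return [counts[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


def encode(seq):
    """Concatenate both encodings into one 420-dimensional feature vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


if __name__ == "__main__":
    # Made-up toy sequence, not taken from the article's datasets.
    vector = encode("MKTAYIAKQR")
    print(len(vector))
```

A vector like this could then be passed to any SVM implementation. The record states that the published model used 136 selected features drawn from a larger set that also included atomic composition, physicochemical properties, and chain-transition-distribution, none of which are reproduced here.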