Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy
Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have therapeutic potential for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing no...
Main authors: | Zhang, Lina; Zhang, Chengjin; Gao, Rui; Yang, Runtao; Song, Qing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2016 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5035026/ https://www.ncbi.nlm.nih.gov/pubmed/27662651 http://dx.doi.org/10.1371/journal.pone.0163274 |
_version_ | 1782455366873776128 |
---|---|
author | Zhang, Lina Zhang, Chengjin Gao, Rui Yang, Runtao Song, Qing |
author_facet | Zhang, Lina Zhang, Chengjin Gao, Rui Yang, Runtao Song, Qing |
author_sort | Zhang, Lina |
collection | PubMed |
description | Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have therapeutic potential for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. Relief combined with IFS (Incremental Feature Selection) is adopted to obtain optimal features from the hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc. |
format | Online Article Text |
id | pubmed-5035026 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-50350262016-10-10 Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy Zhang, Lina Zhang, Chengjin Gao, Rui Yang, Runtao Song, Qing PLoS One Research Article Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have therapeutic potential for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. Relief combined with IFS (Incremental Feature Selection) is adopted to obtain optimal features from the hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc. Public Library of Science 2016-09-23 /pmc/articles/PMC5035026/ /pubmed/27662651 http://dx.doi.org/10.1371/journal.pone.0163274 Text en © 2016 Zhang et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Zhang, Lina Zhang, Chengjin Gao, Rui Yang, Runtao Song, Qing Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy |
title | Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy |
title_full | Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy |
title_fullStr | Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy |
title_full_unstemmed | Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy |
title_short | Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy |
title_sort | sequence based prediction of antioxidant proteins using a classifier selection strategy |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5035026/ https://www.ncbi.nlm.nih.gov/pubmed/27662651 http://dx.doi.org/10.1371/journal.pone.0163274 |
work_keys_str_mv | AT zhanglina sequencebasedpredictionofantioxidantproteinsusingaclassifierselectionstrategy AT zhangchengjin sequencebasedpredictionofantioxidantproteinsusingaclassifierselectionstrategy AT gaorui sequencebasedpredictionofantioxidantproteinsusingaclassifierselectionstrategy AT yangruntao sequencebasedpredictionofantioxidantproteinsusingaclassifierselectionstrategy AT songqing sequencebasedpredictionofantioxidantproteinsusingaclassifierselectionstrategy |
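
The description field states that the ensemble prediction is "an average of prediction results of multiple base classifiers". A minimal sketch of that averaging step follows. It assumes each base classifier emits a probability that a protein is an antioxidant and that a 0.5 cutoff is applied to the mean; the record specifies neither, and the probabilities below are made-up illustrative values, not outputs of the published models.

```python
def average_ensemble(base_probabilities, threshold=0.5):
    """Average per-sample probabilities across base classifiers.

    base_probabilities: dict mapping a classifier name to a list of
    probabilities, one per sample (all lists the same length).
    threshold: assumed decision cutoff; not taken from the record.
    Returns (averaged probabilities, predicted labels).
    """
    columns = list(base_probabilities.values())
    n_classifiers = len(columns)
    # zip(*columns) groups the outputs of all classifiers for each sample.
    averaged = [sum(sample) / n_classifiers for sample in zip(*columns)]
    labels = [1 if p >= threshold else 0 for p in averaged]
    return averaged, labels


# Hypothetical outputs for three proteins from the four base classifiers
# named in the description (RF, SMO, NNA, J48).
toy_outputs = {
    "RF": [0.9, 0.2, 0.6],
    "SMO": [0.8, 0.4, 0.4],
    "NNA": [1.0, 0.0, 0.0],
    "J48": [0.7, 0.3, 0.8],
}

averaged, labels = average_ensemble(toy_outputs)
print(averaged, labels)
```

Averaging probabilities rather than counting votes lets a confident classifier outweigh several uncertain ones, which is one common reason to prefer it; whether the authors averaged probabilities or hard labels is not stated here.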
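
The four figures quoted in the description (sensitivity, specificity, accuracy, MCC) all derive from one confusion matrix. The sketch below computes them with the standard definitions. The counts are hypothetical: 100 positives and 100 negatives were chosen only because they reproduce the reported rates, and the actual dataset sizes are not given in this record.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy and MCC from raw counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical balanced counts consistent with the reported 0.95 / 0.93.
metrics = binary_metrics(tp=95, fn=5, tn=93, fp=7)
print({name: round(value, 3) for name, value in metrics.items()})
```

With these assumed counts the accuracy comes out at 0.94 and the MCC at about 0.880, so the reported values are mutually consistent for a class-balanced evaluation set.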
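
The description also mentions Relief combined with IFS (Incremental Feature Selection). In its usual form, IFS takes features already ranked by some criterion (here that ranking would come from Relief), adds them one at a time, and keeps the prefix that scores best. The sketch below shows only that loop. The feature names, the `evaluate` callback, and the toy scores are hypothetical placeholders; the record does not say how the authors scored each subset.

```python
def incremental_feature_selection(ranked_features, evaluate):
    """Return the best-scoring prefix of a ranked feature list.

    ranked_features: feature names ordered from most to least relevant.
    evaluate: callable taking a list of features and returning a score
    (for example cross-validated accuracy); supplied by the caller.
    """
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]      # top-k features by rank
        score = evaluate(subset)
        if score > best_score:            # ties keep the smaller subset
            best_subset, best_score = list(subset), score
    return best_subset, best_score


# Hypothetical ranking and scores, for illustration only.
ranked = ["f3", "f1", "f4", "f2"]
toy_scores = {1: 0.80, 2: 0.88, 3: 0.91, 4: 0.89}

best_subset, best_score = incremental_feature_selection(
    ranked, evaluate=lambda subset: toy_scores[len(subset)]
)
print(best_subset, best_score)
```

In practice `evaluate` would train and validate a classifier on the chosen columns, so the loop costs one model fit per candidate subset size.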