PredSTP: a highly accurate SVM based model to predict sequential cystine stabilized peptides
BACKGROUND: Numerous organisms have evolved a wide range of toxic peptides for self-defense and predation. Their effective interstitial and macro-environmental use requires energetic and structural stability. One successful group of these peptides includes a tri-disulfide domain arrangement that off...
Main authors: | Islam, S. M. Ashiqul; Sajed, Tanvir; Kearney, Christopher Michel; Baker, Erich J |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4491269/ https://www.ncbi.nlm.nih.gov/pubmed/26142484 http://dx.doi.org/10.1186/s12859-015-0633-x |
_version_ | 1782379615996608512 |
---|---|
author | Islam, S. M. Ashiqul; Sajed, Tanvir; Kearney, Christopher Michel; Baker, Erich J |
author_sort | Islam, S. M. Ashiqul |
collection | PubMed |
description | BACKGROUND: Numerous organisms have evolved a wide range of toxic peptides for self-defense and predation. Their effective interstitial and macro-environmental use requires energetic and structural stability. One successful group of these peptides includes a tri-disulfide domain arrangement that offers toxicity and high stability. Sequential tri-disulfide connectivity variants create highly compact disulfide folds capable of withstanding a variety of environmental stresses. Their combination of toxicity and stability makes these peptides remarkably valuable for their potential as bio-insecticides, antimicrobial peptides and peptide drug candidates. However, the wide sequence variation, sources and modalities of group members impose serious limitations on our ability to rapidly identify potential members. As a result, there is a need for automated high-throughput member classification approaches that leverage their demonstrated tertiary and functional homology. RESULTS: We developed an SVM-based model to predict sequential tri-disulfide peptide (STP) toxins from peptide sequences. One optimized model, called PredSTP, predicted STPs from the training set with sensitivity, specificity, precision, accuracy and a Matthews correlation coefficient of 94.86 %, 94.11 %, 84.31 %, 94.30 % and 0.86, respectively, using 200-fold cross-validation. The same model outperforms existing prediction approaches in three independent out-of-sample test sets derived from PDB. CONCLUSION: PredSTP can accurately identify a wide range of cystine stabilized peptide toxins directly from sequences in a species-agnostic fashion. The ability to rapidly filter sequences for potential bioactive peptides can greatly compress the time between peptide identification and testing structural and functional properties for possible antimicrobial and insecticidal candidates. A web interface for predicting STP toxins is freely available at http://crick.ecs.baylor.edu/. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0633-x) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4491269 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4491269 2015-07-05 BMC Bioinformatics Methodology Article BioMed Central 2015-07-05 /pmc/articles/PMC4491269/ /pubmed/26142484 http://dx.doi.org/10.1186/s12859-015-0633-x Text en © Islam et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | PredSTP: a highly accurate SVM based model to predict sequential cystine stabilized peptides |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4491269/ https://www.ncbi.nlm.nih.gov/pubmed/26142484 http://dx.doi.org/10.1186/s12859-015-0633-x |
work_keys_str_mv | AT islamsmashiqul predstpahighlyaccuratesvmbasedmodeltopredictsequentialcystinestabilizedpeptides AT sajedtanvir predstpahighlyaccuratesvmbasedmodeltopredictsequentialcystinestabilizedpeptides AT kearneychristophermichel predstpahighlyaccuratesvmbasedmodeltopredictsequentialcystinestabilizedpeptides AT bakererichj predstpahighlyaccuratesvmbasedmodeltopredictsequentialcystinestabilizedpeptides |
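
The description above reports sensitivity, specificity, precision, accuracy and a Matthews correlation coefficient (MCC) for the classifier. As a minimal sketch of how these five measures are conventionally derived from a binary confusion matrix, the Python below may help. The function name `binary_metrics` and the example counts are hypothetical illustrations; they are not taken from the article and do not reproduce its results.

```python
from math import sqrt


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification measures from confusion-matrix counts.

    tp/fn: positives (e.g. STPs) predicted correctly / missed
    tn/fp: negatives predicted correctly / wrongly flagged as positive
    """
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate (recall)
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "precision": _ratio(tp, tp + fp),    # positive predictive value
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only for illustration, not from the paper.
    example = binary_metrics(tp=90, fp=10, tn=190, fn=10)
    for name, value in example.items():
        print(f"{name}: {value:.4f}")
```

When positives are a minority of the data, as the gap between the reported precision and specificity suggests, MCC is often a more informative single figure than accuracy because it uses all four cells of the confusion matrix.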