
PredSTP: a highly accurate SVM based model to predict sequential cystine stabilized peptides

BACKGROUND: Numerous organisms have evolved a wide range of toxic peptides for self-defense and predation. Their effective interstitial and macro-environmental use requires energetic and structural stability. One successful group of these peptides includes a tri-disulfide domain arrangement that off...


Bibliographic Details
Main Authors: Islam, S. M. Ashiqul, Sajed, Tanvir, Kearney, Christopher Michel, Baker, Erich J
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4491269/
https://www.ncbi.nlm.nih.gov/pubmed/26142484
http://dx.doi.org/10.1186/s12859-015-0633-x
author Islam, S. M. Ashiqul
Sajed, Tanvir
Kearney, Christopher Michel
Baker, Erich J
collection PubMed
description BACKGROUND: Numerous organisms have evolved a wide range of toxic peptides for self-defense and predation. Their effective interstitial and macro-environmental use requires energetic and structural stability. One successful group of these peptides includes a tri-disulfide domain arrangement that offers toxicity and high stability. Sequential tri-disulfide connectivity variants create highly compact disulfide folds capable of withstanding a variety of environmental stresses. Their combination of toxicity and stability makes these peptides remarkably valuable as potential bio-insecticides, antimicrobial peptides and peptide drug candidates. However, the wide variation in sequence, source and modality among group members seriously limits our ability to rapidly identify potential members. As a result, there is a need for automated high-throughput member classification approaches that leverage their demonstrated tertiary and functional homology. RESULTS: We developed an SVM-based model to predict sequential tri-disulfide peptide (STP) toxins from peptide sequences. One optimized model, called PredSTP, predicted STPs from the training set with sensitivity, specificity, precision, accuracy and a Matthews correlation coefficient of 94.86%, 94.11%, 84.31%, 94.30% and 0.86, respectively, using 200-fold cross-validation. The same model outperforms existing prediction approaches on three independent out-of-sample test sets derived from PDB. CONCLUSION: PredSTP can accurately identify a wide range of cystine-stabilized peptide toxins directly from sequences in a species-agnostic fashion. The ability to rapidly filter sequences for potential bioactive peptides can greatly compress the time between peptide identification and the testing of structural and functional properties for possible antimicrobial and insecticidal candidates. A web interface for predicting STP toxins is freely available at http://crick.ecs.baylor.edu/.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0633-x) contains supplementary material, which is available to authorized users.
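The five performance measures quoted in the RESULTS section (sensitivity, specificity, precision, accuracy and the Matthews correlation coefficient) are all standard functions of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the article.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return standard binary-classification metrics from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives and false negatives. Ratios with a zero denominator
    are reported as 0.0 rather than raising an error.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    # Product of the four marginal totals, used in the MCC denominator.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),   # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "precision": ratio(tp, tp + fp),     # positive predictive value
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts for illustration only (not data from the article).
example = binary_metrics(tp=90, fp=20, tn=180, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Because precision depends on the ratio of positives to negatives in the data set, it can sit well below sensitivity and specificity when negatives outnumber positives, which is consistent with the pattern of values reported above.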
format Online
Article
Text
id pubmed-4491269
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4491269 2015-07-05 BMC Bioinformatics Methodology Article BioMed Central 2015-07-05 /pmc/articles/PMC4491269/ /pubmed/26142484 http://dx.doi.org/10.1186/s12859-015-0633-x Text en © Islam et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title PredSTP: a highly accurate SVM based model to predict sequential cystine stabilized peptides
topic Methodology Article