SkipCPP-Pred: an improved and promising sequence-based predictor for predicting cell-penetrating peptides
BACKGROUND: Cell-penetrating peptides (CPPs) are short peptides (5–30 amino acids) that can enter almost any cell without significant damage. On account of their high delivery efficiency, CPPs are promising candidates for gene therapy and cancer treatment. Accordingly, techniques that correctly pred...
Main Authors: | Wei, Leyi; Tang, Jijun; Zou, Quan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5657092/ https://www.ncbi.nlm.nih.gov/pubmed/29513192 http://dx.doi.org/10.1186/s12864-017-4128-1 |
_version_ | 1783273819222310912 |
---|---|
author | Wei, Leyi Tang, Jijun Zou, Quan |
collection | PubMed |
description | BACKGROUND: Cell-penetrating peptides (CPPs) are short peptides (5–30 amino acids) that can enter almost any cell without significant damage. On account of their high delivery efficiency, CPPs are promising candidates for gene therapy and cancer treatment. Accordingly, techniques that correctly predict CPPs are anticipated to accelerate CPP applications in future therapeutics. Recently, computational methods have been reportedly successful in predicting CPPs. Unfortunately, the predictive performance of existing methods is not satisfactory and reliable so as to accurately identify CPPs. RESULTS: In this study, we propose a novel computational predictor called SkipCPP-Pred to further improve the predictive performance. The novelty of the proposed predictor is that we present a sequence-based feature representation algorithm called adaptive k-skip-n-gram that sufficiently captures the intrinsic correlation information of residues. By fusing the proposed adaptive skip features with a random forest (RF) classifier, we successfully construct the prediction model of SkipCPP-Pred. The various jackknife results demonstrate that the proposed SkipCPP-Pred is 3.6% higher than state-of-the-art CPP predictors in terms of accuracy. Moreover, we construct a high-quality benchmark dataset by reducing the data redundancy and enhancing the similarity between the positive and negative classes. Using this dataset to build prediction models, we can successfully avoid the performance bias lying in existing methods and yield a promising predictive model. CONCLUSIONS: The proposed SkipCPP-Pred is a simple and fast sequence-based predictor featured with the adaptive k-skip-n-gram model for the improved prediction of CPPs. Currently, SkipCPP-Pred is publicly available from an online webserver (http://server.malab.cn/SkipCPP-Pred/Index.html). ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-017-4128-1) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5657092 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5657092 2017-10-31 SkipCPP-Pred: an improved and promising sequence-based predictor for predicting cell-penetrating peptides Wei, Leyi Tang, Jijun Zou, Quan BMC Genomics Research BioMed Central 2017-10-16 /pmc/articles/PMC5657092/ /pubmed/29513192 http://dx.doi.org/10.1186/s12864-017-4128-1 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | SkipCPP-Pred: an improved and promising sequence-based predictor for predicting cell-penetrating peptides |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5657092/ https://www.ncbi.nlm.nih.gov/pubmed/29513192 http://dx.doi.org/10.1186/s12864-017-4128-1 |
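
The abstract names the feature representation (adaptive k-skip-n-gram) but this record does not define it. As a rough illustration only, the following Python sketch computes plain skip-bigram frequencies (n = 2) for a peptide: every ordered residue pair separated by at most `max_skip` intervening residues is counted, and the counts are normalized into a 400-dimensional vector. The function name, the `max_skip` parameter, the normalization, and the handling of non-standard residues are assumptions made for this example; it is not the authors' implementation, and the adaptive choice of k described in the paper is not reproduced here.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# All 400 ordered residue pairs, in a fixed order ("AA", "AC", ..., "YY").
PAIR_KEYS = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def skip_bigram_features(peptide, max_skip):
    """Return a 400-dimensional list of skip-bigram frequencies.

    Hypothetical helper: a pair (a, b) is counted whenever b follows a
    with 0..max_skip residues skipped in between. Pairs containing a
    non-standard residue are ignored (an assumption of this sketch).
    """
    peptide = peptide.upper()
    counts = Counter()
    for i, first in enumerate(peptide):
        if first not in AMINO_ACIDS:
            continue
        # gap = 1 is the adjacent residue; gap = max_skip + 1 skips max_skip residues.
        for gap in range(1, max_skip + 2):
            j = i + gap
            if j >= len(peptide):
                break
            second = peptide[j]
            if second in AMINO_ACIDS:
                counts[first + second] += 1
    total = sum(counts.values())
    if total == 0:
        # Too short (or no valid pairs): return an all-zero vector.
        return [0.0] * len(PAIR_KEYS)
    return [counts[key] / total for key in PAIR_KEYS]
```

For example, `skip_bigram_features("ACA", 1)` counts the pairs AC, AA, and CA once each, so each of those three entries is 1/3 and all others are 0. A vector like this could then be passed to a random forest classifier, as the abstract describes, though the classifier settings are not given in this record.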