
Predicting cell-penetrating peptides using machine learning algorithms and navigating in their chemical space

Cell-penetrating peptides (CPPs) are naturally able to cross the lipid bilayer membrane that protects cells. These peptides share common structural and physicochemical properties and show different pharmaceutical applications, among which drug delivery is the most important. Due to their ability to...


Bibliographic Details
Main authors: de Oliveira, Ewerton Cristhian Lima; Santana, Kauê; Josino, Luiz; Lima e Lima, Anderson Henrique; de Souza de Sales Júnior, Claudomiro
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8027643/
https://www.ncbi.nlm.nih.gov/pubmed/33828175
http://dx.doi.org/10.1038/s41598-021-87134-w
author de Oliveira, Ewerton Cristhian Lima
Santana, Kauê
Josino, Luiz
Lima e Lima, Anderson Henrique
de Souza de Sales Júnior, Claudomiro
collection PubMed
description Cell-penetrating peptides (CPPs) are naturally able to cross the lipid bilayer membrane that protects cells. These peptides share common structural and physicochemical properties and show different pharmaceutical applications, among which drug delivery is the most important. Due to their ability to cross the membranes by pulling high-molecular-weight polar molecules, they are termed Trojan horses. In this study, we proposed a machine learning (ML)-based framework named BChemRF-CPPred (beyond chemical rules-based framework for CPP prediction) that uses an artificial neural network, a support vector machine, and a Gaussian process classifier to differentiate CPPs from non-CPPs, using structure- and sequence-based descriptors extracted from PDB and FASTA formats. The performance of our algorithm was evaluated by tenfold cross-validation and compared with those of previously reported prediction tools using an independent dataset. The BChemRF-CPPred satisfactorily identified CPP-like structures using natural and synthetic modified peptide libraries and also obtained better performance than those of previously reported ML-based algorithms, reaching the independent test accuracy of 90.66% (AUC = 0.9365) for PDB, and an accuracy of 86.5% (AUC = 0.9216) for FASTA input. Moreover, our analyses of the CPP chemical space demonstrated that these peptides break some molecular rules related to the prediction of permeability of therapeutic molecules in cell membranes. This is the first comprehensive analysis to predict synthetic and natural CPP structures and to evaluate their chemical space using an ML-based framework. Our algorithm is freely available for academic use at http://comptools.linc.ufpa.br/BChemRF-CPPred.
format Online
Article
Text
id pubmed-8027643
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8027643 2021-04-08 Sci Rep Article Nature Publishing Group UK 2021-04-07 /pmc/articles/PMC8027643/ /pubmed/33828175 http://dx.doi.org/10.1038/s41598-021-87134-w Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Predicting cell-penetrating peptides using machine learning algorithms and navigating in their chemical space
topic Article