MLACP 2.0: An updated machine learning tool for anticancer peptide prediction

Anticancer peptides are an emerging class of anticancer drugs that offer fewer side effects and are more effective than chemotherapy and targeted therapy. Predicting anticancer peptides from sequence information is one of the most challenging tasks in immunoinformatics. In the past ten years, machine learning-ba...

Bibliographic Details
Main Authors: Thi Phan, Le, Woo Park, Hyun, Pitti, Thejkiran, Madhavan, Thirumurthy, Jeon, Young-Jun, Manavalan, Balachandran
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9421197/
https://www.ncbi.nlm.nih.gov/pubmed/36051870
http://dx.doi.org/10.1016/j.csbj.2022.07.043
author Thi Phan, Le
Woo Park, Hyun
Pitti, Thejkiran
Madhavan, Thirumurthy
Jeon, Young-Jun
Manavalan, Balachandran
collection PubMed
description Anticancer peptides (ACPs) are an emerging class of anticancer drugs that offer fewer side effects and are more effective than chemotherapy and targeted therapy. Predicting anticancer peptides from sequence information is one of the most challenging tasks in immunoinformatics. In the past ten years, machine learning-based approaches have been proposed for identifying ACP activity from peptide sequences. These methods include our previous method, MLACP (developed in 2017), which made a significant impact on anticancer research. The MLACP tool has been widely used by the research community; however, its robustness must be improved significantly for its continued practical application. In this study, the first large non-redundant training and independent datasets were constructed for ACP research. Using the training dataset, the study explored a wide range of feature encodings and developed their respective models using seven different conventional classifiers. Subsequently, a subset of encoding-based models was selected for each classifier based on performance; their predicted scores were concatenated and used to train a convolutional neural network (CNN), and the resulting predictor is named MLACP 2.0. Evaluation of MLACP 2.0 on a very diverse independent dataset showed excellent performance, significantly outperforming recent ACP prediction tools. Additionally, MLACP 2.0 exhibits superior performance during cross-validation and independent assessment when compared to CNN-based embedding models and conventional single models. Consequently, we anticipate that our proposed MLACP 2.0 will facilitate the design of hypothesis-driven experiments by making it easier to discover novel ACPs. MLACP 2.0 is freely available at https://balalab-skku.org/mlacp2.
format Online
Article
Text
id pubmed-9421197
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-9421197 2022-08-31 MLACP 2.0: An updated machine learning tool for anticancer peptide prediction Thi Phan, Le Woo Park, Hyun Pitti, Thejkiran Madhavan, Thirumurthy Jeon, Young-Jun Manavalan, Balachandran Comput Struct Biotechnol J Research Article Research Network of Computational and Structural Biotechnology 2022-08-02 /pmc/articles/PMC9421197/ /pubmed/36051870 http://dx.doi.org/10.1016/j.csbj.2022.07.043 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title MLACP 2.0: An updated machine learning tool for anticancer peptide prediction
topic Research Article