Active Semisupervised Model for Improving the Identification of Anticancer Peptides

Cancer is one of the most dangerous threats to human health. Accurate identification of anticancer peptides (ACPs) is valuable for the development and design of new anticancer agents. However, most machine-learning algorithms have limited ability to identify ACPs, and their accurac...

Bibliographic Details
Main authors: Cai, Lijun; Wang, Li; Fu, Xiangzheng; Zeng, Xiangxiang
Format: Online Article Text
Language: English
Published: American Chemical Society 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8459422/
https://www.ncbi.nlm.nih.gov/pubmed/34568678
http://dx.doi.org/10.1021/acsomega.1c03132
_version_ 1784571517292511232
author Cai, Lijun
Wang, Li
Fu, Xiangzheng
Zeng, Xiangxiang
collection PubMed
description Cancer is one of the most dangerous threats to human health. Accurate identification of anticancer peptides (ACPs) is valuable for the development and design of new anticancer agents. However, most machine-learning algorithms have limited ability to identify ACPs, and their accuracy is sensitive to the amount of labeled data. In this paper, we construct a new method, called ACP-ALPM, that combines active learning (AL) with a label propagation (LP) algorithm to address this problem. First, we develop an efficient feature representation method based on various descriptor and coding information of the peptide sequence. Then, an AL strategy is used to select the most informative data for model training, and a more powerful LP classifier is built through successive iterations. Finally, we evaluate the performance of ACP-ALPM and compare it with that of several state-of-the-art and classic methods; experimental results show that our method is significantly superior to them. In addition, an experimental comparison of random selection and AL on three public data sets shows that the AL strategy is more effective. Notably, a visualization experiment further verified that AL can utilize unlabeled data to improve the performance of the model. We hope that our method can be extended to other types of peptides and provide inspiration for similar work.
format Online
Article
Text
id pubmed-8459422
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-8459422 2021-09-24 Active Semisupervised Model for Improving the Identification of Anticancer Peptides Cai, Lijun; Wang, Li; Fu, Xiangzheng; Zeng, Xiangxiang ACS Omega American Chemical Society 2021-09-08 /pmc/articles/PMC8459422/ /pubmed/34568678 http://dx.doi.org/10.1021/acsomega.1c03132 Text en © 2021 The Authors. Published by American Chemical Society.
License: https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works.
title Active Semisupervised Model for Improving the Identification of Anticancer Peptides
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8459422/
https://www.ncbi.nlm.nih.gov/pubmed/34568678
http://dx.doi.org/10.1021/acsomega.1c03132