
An Augmented Sample Selection Framework for Prediction of Anticancer Peptides

Anticancer peptides (ACPs) have promising prospects for cancer treatment. Traditional ACP identification experiments have the limitations of low efficiency and high cost. In recent years, data-driven deep learning techniques have shown significant potential for ACP prediction. However, data-driven p...


Bibliographic Details
Main Authors: Tao, Huawei, Shan, Shuai, Fu, Hongliang, Zhu, Chunhua, Liu, Boye
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10535447/
https://www.ncbi.nlm.nih.gov/pubmed/37764455
http://dx.doi.org/10.3390/molecules28186680
author Tao, Huawei
Shan, Shuai
Fu, Hongliang
Zhu, Chunhua
Liu, Boye
collection PubMed
description Anticancer peptides (ACPs) have promising prospects for cancer treatment. Traditional ACP identification experiments have the limitations of low efficiency and high cost. In recent years, data-driven deep learning techniques have shown significant potential for ACP prediction. However, data-driven prediction models rely heavily on extensive training data. Furthermore, the current publicly accessible ACP dataset is limited in size, leading to inadequate model generalization. While data augmentation effectively expands dataset size, existing techniques for augmenting ACP data often generate noisy samples, adversely affecting prediction performance. Therefore, this paper proposes a novel augmented sample selection framework for the prediction of anticancer peptides (ACPs-ASSF). First, the prediction model is trained using raw data. Then, the augmented samples generated using the data augmentation technique are fed into the trained model to compute pseudo-labels and estimate the uncertainty of the model prediction. Finally, samples with low uncertainty, high confidence, and pseudo-labels consistent with the original labels are selected and incorporated into the training set to retrain the model. The evaluation results for the ACP240 and ACP740 datasets show that ACPs-ASSF achieved accuracy improvements of up to 5.41% and 5.68%, respectively, compared to the traditional data augmentation method.
format Online
Article
Text
id pubmed-10535447
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10535447 2023-09-29 An Augmented Sample Selection Framework for Prediction of Anticancer Peptides Tao, Huawei Shan, Shuai Fu, Hongliang Zhu, Chunhua Liu, Boye Molecules Article MDPI 2023-09-18 /pmc/articles/PMC10535447/ /pubmed/37764455 http://dx.doi.org/10.3390/molecules28186680 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title An Augmented Sample Selection Framework for Prediction of Anticancer Peptides
topic Article
work_keys_str_mv AT taohuawei anaugmentedsampleselectionframeworkforpredictionofanticancerpeptides
AT shanshuai anaugmentedsampleselectionframeworkforpredictionofanticancerpeptides
AT fuhongliang anaugmentedsampleselectionframeworkforpredictionofanticancerpeptides
AT zhuchunhua anaugmentedsampleselectionframeworkforpredictionofanticancerpeptides
AT liuboye anaugmentedsampleselectionframeworkforpredictionofanticancerpeptides
AT taohuawei augmentedsampleselectionframeworkforpredictionofanticancerpeptides
AT shanshuai augmentedsampleselectionframeworkforpredictionofanticancerpeptides
AT fuhongliang augmentedsampleselectionframeworkforpredictionofanticancerpeptides
AT zhuchunhua augmentedsampleselectionframeworkforpredictionofanticancerpeptides
AT liuboye augmentedsampleselectionframeworkforpredictionofanticancerpeptides