An Augmented Sample Selection Framework for Prediction of Anticancer Peptides
Anticancer peptides (ACPs) have promising prospects for cancer treatment. Traditional ACP identification experiments have the limitations of low efficiency and high cost. In recent years, data-driven deep learning techniques have shown significant potential for ACP prediction. However, data-driven p...
Main authors: | Tao, Huawei; Shan, Shuai; Fu, Hongliang; Zhu, Chunhua; Liu, Boye |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10535447/ https://www.ncbi.nlm.nih.gov/pubmed/37764455 http://dx.doi.org/10.3390/molecules28186680 |
_version_ | 1785112631524196352 |
---|---|
author | Tao, Huawei; Shan, Shuai; Fu, Hongliang; Zhu, Chunhua; Liu, Boye |
author_facet | Tao, Huawei; Shan, Shuai; Fu, Hongliang; Zhu, Chunhua; Liu, Boye |
author_sort | Tao, Huawei |
collection | PubMed |
description | Anticancer peptides (ACPs) have promising prospects for cancer treatment. Traditional ACP identification experiments have the limitations of low efficiency and high cost. In recent years, data-driven deep learning techniques have shown significant potential for ACP prediction. However, data-driven prediction models rely heavily on extensive training data. Furthermore, the current publicly accessible ACP dataset is limited in size, leading to inadequate model generalization. While data augmentation effectively expands dataset size, existing techniques for augmenting ACP data often generate noisy samples, adversely affecting prediction performance. Therefore, this paper proposes a novel augmented sample selection framework for the prediction of anticancer peptides (ACPs-ASSF). First, the prediction model is trained using raw data. Then, the augmented samples generated using the data augmentation technique are fed into the trained model to compute pseudo-labels and estimate the uncertainty of the model prediction. Finally, samples with low uncertainty, high confidence, and pseudo-labels consistent with the original labels are selected and incorporated into the training set to retrain the model. The evaluation results for the ACP240 and ACP740 datasets show that ACPs-ASSF achieved accuracy improvements of up to 5.41% and 5.68%, respectively, compared to the traditional data augmentation method. |
format | Online Article Text |
id | pubmed-10535447 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10535447 2023-09-29 An Augmented Sample Selection Framework for Prediction of Anticancer Peptides Tao, Huawei; Shan, Shuai; Fu, Hongliang; Zhu, Chunhua; Liu, Boye Molecules Article Anticancer peptides (ACPs) have promising prospects for cancer treatment. Traditional ACP identification experiments have the limitations of low efficiency and high cost. In recent years, data-driven deep learning techniques have shown significant potential for ACP prediction. However, data-driven prediction models rely heavily on extensive training data. Furthermore, the current publicly accessible ACP dataset is limited in size, leading to inadequate model generalization. While data augmentation effectively expands dataset size, existing techniques for augmenting ACP data often generate noisy samples, adversely affecting prediction performance. Therefore, this paper proposes a novel augmented sample selection framework for the prediction of anticancer peptides (ACPs-ASSF). First, the prediction model is trained using raw data. Then, the augmented samples generated using the data augmentation technique are fed into the trained model to compute pseudo-labels and estimate the uncertainty of the model prediction. Finally, samples with low uncertainty, high confidence, and pseudo-labels consistent with the original labels are selected and incorporated into the training set to retrain the model. The evaluation results for the ACP240 and ACP740 datasets show that ACPs-ASSF achieved accuracy improvements of up to 5.41% and 5.68%, respectively, compared to the traditional data augmentation method. MDPI 2023-09-18 /pmc/articles/PMC10535447/ /pubmed/37764455 http://dx.doi.org/10.3390/molecules28186680 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Tao, Huawei Shan, Shuai Fu, Hongliang Zhu, Chunhua Liu, Boye An Augmented Sample Selection Framework for Prediction of Anticancer Peptides |
title | An Augmented Sample Selection Framework for Prediction of Anticancer Peptides |
title_full | An Augmented Sample Selection Framework for Prediction of Anticancer Peptides |
title_fullStr | An Augmented Sample Selection Framework for Prediction of Anticancer Peptides |
title_full_unstemmed | An Augmented Sample Selection Framework for Prediction of Anticancer Peptides |
title_short | An Augmented Sample Selection Framework for Prediction of Anticancer Peptides |
title_sort | augmented sample selection framework for prediction of anticancer peptides |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10535447/ https://www.ncbi.nlm.nih.gov/pubmed/37764455 http://dx.doi.org/10.3390/molecules28186680 |
work_keys_str_mv | AT taohuawei anaugmentedsampleselectionframeworkforpredictionofanticancerpeptides AT shanshuai anaugmentedsampleselectionframeworkforpredictionofanticancerpeptides AT fuhongliang anaugmentedsampleselectionframeworkforpredictionofanticancerpeptides AT zhuchunhua anaugmentedsampleselectionframeworkforpredictionofanticancerpeptides AT liuboye anaugmentedsampleselectionframeworkforpredictionofanticancerpeptides AT taohuawei augmentedsampleselectionframeworkforpredictionofanticancerpeptides AT shanshuai augmentedsampleselectionframeworkforpredictionofanticancerpeptides AT fuhongliang augmentedsampleselectionframeworkforpredictionofanticancerpeptides AT zhuchunhua augmentedsampleselectionframeworkforpredictionofanticancerpeptides AT liuboye augmentedsampleselectionframeworkforpredictionofanticancerpeptides |
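
The selection step summarized in the description field (keep augmented samples that have low uncertainty, high confidence, and a pseudo-label matching the original label) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how uncertainty is estimated or which thresholds are used. The sketch assumes the trained model is run several times with stochastic forward passes (for example, dropout left active) and takes the standard deviation across passes as the uncertainty. The function name `select_augmented_samples` and both threshold values are hypothetical.

```python
import numpy as np


def select_augmented_samples(probs, orig_labels,
                             conf_threshold=0.9, unc_threshold=0.05):
    """Pick augmented samples that look trustworthy enough to train on.

    probs: shape (T, N, C), class probabilities from T stochastic forward
        passes of the already-trained model (an assumption of this sketch).
    orig_labels: shape (N,), label of the raw sample each augmented
        sample was generated from.
    Thresholds are illustrative placeholders, not values from the paper.
    """
    probs = np.asarray(probs, dtype=float)
    orig_labels = np.asarray(orig_labels)

    mean_probs = probs.mean(axis=0)            # (N, C) averaged prediction
    pseudo_labels = mean_probs.argmax(axis=1)  # (N,) predicted class
    confidence = mean_probs.max(axis=1)        # (N,) probability of that class

    # Uncertainty: spread, across passes, of the probability given to the
    # pseudo-label. probs[:, idx, pseudo_labels] has shape (T, N).
    idx = np.arange(probs.shape[1])
    uncertainty = probs[:, idx, pseudo_labels].std(axis=0)

    keep = (
        (confidence >= conf_threshold)
        & (uncertainty <= unc_threshold)
        & (pseudo_labels == orig_labels)
    )
    return keep, pseudo_labels, confidence, uncertainty


# Toy data: 3 passes, 3 augmented samples, 2 classes (non-ACP = 0, ACP = 1).
toy_probs = np.array([
    [[0.05, 0.95], [0.90, 0.10], [0.97, 0.03]],
    [[0.04, 0.96], [0.30, 0.70], [0.96, 0.04]],
    [[0.06, 0.94], [0.95, 0.05], [0.98, 0.02]],
])
toy_labels = np.array([1, 0, 1])

keep, pseudo, conf, unc = select_augmented_samples(toy_probs, toy_labels)
print(keep)  # sample 0 kept; sample 1 is unstable; sample 2 contradicts its label
```

In a full pipeline, the samples where `keep` is true would be appended to the raw training set before retraining, which is the last step the description lists.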