ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation
Anticancer peptides (ACPs) have provided a promising perspective for cancer treatment, and the prediction of ACPs is very important for the discovery of new cancer treatment drugs. It is time consuming and expensive to use experimental methods to identify ACPs, so computational methods for ACP ident...
Main Authors: | Chen, Xian-gan; Zhang, Wen; Yang, Xiaofei; Li, Chenhong; Chen, Hengling |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8279753/ https://www.ncbi.nlm.nih.gov/pubmed/34276801 http://dx.doi.org/10.3389/fgene.2021.698477 |
_version_ | 1783722510932508672 |
---|---|
author | Chen, Xian-gan Zhang, Wen Yang, Xiaofei Li, Chenhong Chen, Hengling |
author_facet | Chen, Xian-gan Zhang, Wen Yang, Xiaofei Li, Chenhong Chen, Hengling |
author_sort | Chen, Xian-gan |
collection | PubMed |
description | Anticancer peptides (ACPs) have provided a promising perspective for cancer treatment, and the prediction of ACPs is very important for the discovery of new cancer treatment drugs. It is time consuming and expensive to use experimental methods to identify ACPs, so computational methods for ACP identification are urgently needed. There have been many effective computational methods, especially machine learning-based methods, proposed for such predictions. Most of the current machine learning methods try to find suitable features or design effective feature learning techniques to accurately represent ACPs. However, the performance of these methods can be further improved for cases with insufficient numbers of samples. In this article, we propose an ACP prediction model called ACP-DA (Data Augmentation), which uses data augmentation for insufficient samples to improve the prediction performance. In our method, to better exploit the information of peptide sequences, peptide sequences are represented by integrating binary profile features and AAindex features, and then the samples in the training set are augmented in the feature space. After data augmentation, the samples are used to train the machine learning model, which is used to predict ACPs. The performance of ACP-DA exceeds that of existing methods, and ACP-DA achieves better performance in the prediction of ACPs compared with a method without data augmentation. The proposed method is available at http://github.com/chenxgscuec/ACPDA. |
format | Online Article Text |
id | pubmed-8279753 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8279753 2021-07-15 ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation Chen, Xian-gan Zhang, Wen Yang, Xiaofei Li, Chenhong Chen, Hengling Front Genet Genetics Anticancer peptides (ACPs) have provided a promising perspective for cancer treatment, and the prediction of ACPs is very important for the discovery of new cancer treatment drugs. It is time consuming and expensive to use experimental methods to identify ACPs, so computational methods for ACP identification are urgently needed. There have been many effective computational methods, especially machine learning-based methods, proposed for such predictions. Most of the current machine learning methods try to find suitable features or design effective feature learning techniques to accurately represent ACPs. However, the performance of these methods can be further improved for cases with insufficient numbers of samples. In this article, we propose an ACP prediction model called ACP-DA (Data Augmentation), which uses data augmentation for insufficient samples to improve the prediction performance. In our method, to better exploit the information of peptide sequences, peptide sequences are represented by integrating binary profile features and AAindex features, and then the samples in the training set are augmented in the feature space. After data augmentation, the samples are used to train the machine learning model, which is used to predict ACPs. The performance of ACP-DA exceeds that of existing methods, and ACP-DA achieves better performance in the prediction of ACPs compared with a method without data augmentation. The proposed method is available at http://github.com/chenxgscuec/ACPDA. Frontiers Media S.A. 2021-06-30 /pmc/articles/PMC8279753/ /pubmed/34276801 http://dx.doi.org/10.3389/fgene.2021.698477 Text en Copyright © 2021 Chen, Zhang, Yang, Li and Chen.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Chen, Xian-gan Zhang, Wen Yang, Xiaofei Li, Chenhong Chen, Hengling ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation |
title | ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation |
title_full | ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation |
title_fullStr | ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation |
title_full_unstemmed | ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation |
title_short | ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation |
title_sort | acp-da: improving the prediction of anticancer peptides using data augmentation |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8279753/ https://www.ncbi.nlm.nih.gov/pubmed/34276801 http://dx.doi.org/10.3389/fgene.2021.698477 |
work_keys_str_mv | AT chenxiangan acpdaimprovingthepredictionofanticancerpeptidesusingdataaugmentation AT zhangwen acpdaimprovingthepredictionofanticancerpeptidesusingdataaugmentation AT yangxiaofei acpdaimprovingthepredictionofanticancerpeptidesusingdataaugmentation AT lichenhong acpdaimprovingthepredictionofanticancerpeptidesusingdataaugmentation AT chenhengling acpdaimprovingthepredictionofanticancerpeptidesusingdataaugmentation |
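
The description field above outlines a pipeline: encode each peptide with binary profile and AAindex features, then augment the training samples in the feature space. The sketch below is only an illustration of that idea and not the authors' implementation (which the record says is at http://github.com/chenxgscuec/ACPDA). Everything in it is an assumption made for the example: the function names, the 10-residue window, the use of a single Kyte-Doolittle hydropathy scale to stand in for the AAindex features, Gaussian noise as the augmentation scheme, and the toy sequences and labels.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy, used here as a stand-in for AAindex features
# (assumption: the record does not say which AAindex entries are used).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def binary_profile(seq, n_residues=10):
    """One-hot encode the first n_residues; shorter sequences stay zero-padded."""
    feats = np.zeros((n_residues, len(AMINO_ACIDS)))
    for i, aa in enumerate(seq[:n_residues]):
        # str.index raises ValueError for non-standard residues.
        feats[i, AMINO_ACIDS.index(aa)] = 1.0
    return feats.ravel()


def property_profile(seq, n_residues=10):
    """Per-position physicochemical value for the first n_residues."""
    feats = np.zeros(n_residues)
    for i, aa in enumerate(seq[:n_residues]):
        feats[i] = HYDROPATHY[aa]
    return feats


def encode(seq, n_residues=10):
    """Concatenate both feature types into one vector."""
    return np.concatenate(
        [binary_profile(seq, n_residues), property_profile(seq, n_residues)]
    )


def augment(X, y, n_copies=2, noise_scale=0.05, seed=0):
    """Append n_copies noisy copies of every sample (hypothetical scheme)."""
    rng = np.random.default_rng(seed)
    X_parts, y_parts = [X], [y]
    for _ in range(n_copies):
        X_parts.append(X + rng.normal(0.0, noise_scale, size=X.shape))
        y_parts.append(y)
    return np.vstack(X_parts), np.concatenate(y_parts)


# Toy, made-up sequences and labels purely for demonstration.
peptides = ["FLPKLLKKAGLV", "GSTDEEQNPS"]
labels = np.array([1, 0])
X = np.stack([encode(p) for p in peptides])
X_aug, y_aug = augment(X, labels)
```

The augmented matrix `X_aug` and labels `y_aug` could then be passed to any standard classifier; only the training set should be augmented, so that evaluation uses unperturbed samples.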