A novel two-way rebalancing strategy for identifying carbonylation sites
BACKGROUND: As an irreversible post-translational modification, protein carbonylation is closely related to many diseases and aging. Protein carbonylation prediction for related patients is significant, which can help clinicians make appropriate therapeutic schemes. Because carbonylation sites can b...
Main Authors: | Chen, Linjun; Jing, Xiao-Yuan; Hao, Yaru; Liu, Wei; Zhu, Xiaoke; Han, Wei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10644465/ https://www.ncbi.nlm.nih.gov/pubmed/37957582 http://dx.doi.org/10.1186/s12859-023-05551-2 |
_version_ | 1785147234554216448 |
---|---|
author | Chen, Linjun Jing, Xiao-Yuan Hao, Yaru Liu, Wei Zhu, Xiaoke Han, Wei |
author_sort | Chen, Linjun |
collection | PubMed |
description | BACKGROUND: Protein carbonylation is an irreversible post-translational modification closely associated with aging and many diseases. Predicting protein carbonylation can help clinicians choose appropriate therapeutic schemes. Because carbonylation sites can indicate a change or loss of protein function, integrating carbonylation site data is a promising basis for prediction, and several prediction methods built on such data have been proposed. However, most of these data are highly class-imbalanced: un-carbonylation sites greatly outnumber carbonylation sites, and existing methods have not addressed this issue adequately. RESULTS: We propose Carsite_AGan, a novel two-way rebalancing strategy based on the attention technique and a generative adversarial network, for identifying protein carbonylation sites. On one side, Carsite_AGan uses an attention-based undersampling method: the attention technique assigns each sample an importance value, and sites with high importance values are selected from the un-carbonylation sites. On the other side, it uses a generative adversarial network-based oversampling method, in which the generator and discriminator together produce high-feasibility carbonylation sites. Finally, a classifier such as a nonlinear support vector machine identifies protein carbonylation sites. CONCLUSIONS: Experimental results demonstrate that our approach significantly outperforms other resampling methods. Resampling carbonylation data with our approach can significantly improve the identification of protein carbonylation sites. |
format | Online Article Text |
id | pubmed-10644465 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-10644465 2023-11-13 A novel two-way rebalancing strategy for identifying carbonylation sites Chen, Linjun Jing, Xiao-Yuan Hao, Yaru Liu, Wei Zhu, Xiaoke Han, Wei BMC Bioinformatics Research BioMed Central 2023-11-13 /pmc/articles/PMC10644465/ /pubmed/37957582 http://dx.doi.org/10.1186/s12859-023-05551-2 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Chen, Linjun Jing, Xiao-Yuan Hao, Yaru Liu, Wei Zhu, Xiaoke Han, Wei A novel two-way rebalancing strategy for identifying carbonylation sites |
title | A novel two-way rebalancing strategy for identifying carbonylation sites |
title_sort | novel two-way rebalancing strategy for identifying carbonylation sites |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10644465/ https://www.ncbi.nlm.nih.gov/pubmed/37957582 http://dx.doi.org/10.1186/s12859-023-05551-2 |
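
The two-way rebalancing idea summarized in the description field (shrink the majority class by importance, grow the minority class with synthetic samples) can be illustrated with a minimal sketch. This is not Carsite_AGan and makes strong simplifying assumptions: the learned attention scores are replaced by a softmax over scaled dot products with the minority-class centroid, and the GAN generator is replaced by linear interpolation between real minority samples. All function names and the toy data are hypothetical, and the final classifier step (for example, a nonlinear support vector machine) is omitted.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def importance_weights(majority, query):
    """Attention-style importance: softmax of scaled dot products with a query.

    Assumption: a fixed query vector stands in for a trained attention module.
    """
    scale = math.sqrt(len(query))
    scores = [sum(a * b for a, b in zip(x, query)) / scale for x in majority]
    return softmax(scores)


def undersample(majority, query, k):
    """Keep the k majority-class samples with the highest importance weight."""
    weights = importance_weights(majority, query)
    ranked = sorted(range(len(majority)), key=lambda i: weights[i], reverse=True)
    return [majority[i] for i in ranked[:k]]


def oversample(minority, n_new, rng):
    """Create n_new synthetic minority samples by interpolating random pairs.

    Assumption: interpolation is a simple stand-in for a GAN generator.
    """
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        t = rng.random()
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic


def two_way_rebalance(majority, minority, target, seed=0):
    """Return (negatives, positives), each resampled toward `target` samples."""
    rng = random.Random(seed)
    # Hypothetical choice: use the minority centroid as the attention query.
    query = [sum(col) / len(minority) for col in zip(*minority)]
    kept = undersample(majority, query, min(target, len(majority)))
    extra = oversample(minority, max(0, target - len(minority)), rng)
    return kept, minority + extra


# Toy data: 200 un-carbonylation sites vs. 20 carbonylation sites, 4 features each.
data_rng = random.Random(1)
majority = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
minority = [[data_rng.gauss(1, 1) for _ in range(4)] for _ in range(20)]

neg, pos = two_way_rebalance(majority, minority, target=60)
print(len(neg), len(pos))  # prints: 60 60
```

The balanced `neg` and `pos` sets would then be passed to whatever classifier is used downstream. Meeting in the middle, rather than only discarding negatives or only synthesizing positives, is the point of a two-way strategy: it limits both the information lost to undersampling and the share of synthetic data introduced by oversampling.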