
EMDL-ac4C: identifying N4-acetylcytidine based on ensemble two-branch residual connection DenseNet and attention

Introduction: N4-acetylcytidine (ac4C) is a critical acetylation modification that has an essential function in protein translation and is associated with a number of human diseases. Methods: The process of identifying ac4C sites by biological experiments is too cumbersome and costly. And the perfor...


Bibliographic Details
Main authors: Jia, Jianhua; Wei, Zhangying; Cao, Xiaojing
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372626/
https://www.ncbi.nlm.nih.gov/pubmed/37519885
http://dx.doi.org/10.3389/fgene.2023.1232038
author Jia, Jianhua
Wei, Zhangying
Cao, Xiaojing
collection PubMed
description Introduction: N4-acetylcytidine (ac4C) is a critical acetylation modification that has an essential function in protein translation and is associated with a number of human diseases. Methods: Identifying ac4C sites by biological experiments is cumbersome and costly, and the performance of several existing computational models needs to be improved. Therefore, we propose a new deep learning tool, EMDL-ac4C, to predict ac4C sites. It applies a simple one-hot encoding to an unbalanced dataset and uses a downsampled ensemble deep learning network to extract the features that are important for identifying ac4C sites. The base learner of this ensemble model consists of a modified DenseNet and Squeeze-and-Excitation Networks. In addition, we add a convolutional residual structure in parallel with the dense block to achieve the effect of two-layer feature extraction. Results: The average accuracy (Acc), Matthews correlation coefficient (MCC), and area under the curve (AUC) of EMDL-ac4C on ten independent testing sets are 80.84%, 61.77%, and 87.94%, respectively. Discussion: Multiple experimental comparisons indicate that EMDL-ac4C outperforms existing predictors and greatly improves the predictive performance for ac4C sites. At the same time, EMDL-ac4C could provide a valuable reference for the next part of the study. The source code and experimental data are available at: https://github.com/13133989982/EMDLac4C.
format Online
Article
Text
id pubmed-10372626
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10372626 2023-07-28 Front Genet Genetics Frontiers Media S.A. 2023-07-13 /pmc/articles/PMC10372626/ /pubmed/37519885 http://dx.doi.org/10.3389/fgene.2023.1232038 Text en Copyright © 2023 Jia, Wei and Cao. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title EMDL-ac4C: identifying N4-acetylcytidine based on ensemble two-branch residual connection DenseNet and attention
topic Genetics
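The abstract in this record mentions one-hot encoding, downsampling of an unbalanced dataset, and the Acc and MCC metrics. The sketch below illustrates those three ideas in plain Python. It is not the authors' code (which is at the GitHub URL given in the abstract). The `ACGU` alphabet, the all-zero vector for unknown symbols, and the function names `one_hot`, `downsample_subsets`, and `accuracy_and_mcc` are assumptions made for illustration only.

```python
import math
import random

NUCLEOTIDES = "ACGU"  # assumed alphabet; the record does not specify it


def one_hot(seq):
    """Encode an RNA sequence as a list of 4-element 0/1 vectors."""
    table = {
        nt: [1 if i == j else 0 for j in range(len(NUCLEOTIDES))]
        for i, nt in enumerate(NUCLEOTIDES)
    }
    # Assumption: T is treated as U, and unknown symbols become all zeros.
    cleaned = seq.upper().replace("T", "U")
    return [table.get(nt, [0, 0, 0, 0]) for nt in cleaned]


def downsample_subsets(minority, majority, n_subsets, seed=0):
    """Build balanced training subsets for an ensemble.

    Each subset keeps every minority sample and draws an equally sized
    random sample (without replacement) from the majority class.
    """
    if len(majority) < len(minority):
        raise ValueError("majority class must be at least as large as minority")
    rng = random.Random(seed)
    k = len(minority)
    return [(list(minority), rng.sample(majority, k)) for _ in range(n_subsets)]


def accuracy_and_mcc(y_true, y_pred):
    """Return (accuracy, Matthews correlation coefficient) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    acc = (tp + tn) / len(pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc
```

In a setup like this, one base learner would typically be trained per balanced subset and the predictions combined; how EMDL-ac4C combines its base learners is not stated in this record.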