Systematic Analysis and Accurate Identification of DNA N4-Methylcytosine Sites by Deep Learning

DNA N(4)-methylcytosine (4mC) is a pivotal epigenetic modification that plays an essential role in DNA replication, repair, expression and differentiation. To gain insight into the biological functions of 4mC, it is critical to identify its modification sites in the genome. Recently, deep learni...

Bibliographic Details
Main authors: Yu, Lezheng, Zhang, Yonglin, Xue, Li, Liu, Fengjuan, Chen, Qi, Luo, Jiesi, Jing, Runyu
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Microbiology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8989013/
https://www.ncbi.nlm.nih.gov/pubmed/35401453
http://dx.doi.org/10.3389/fmicb.2022.843425
author Yu, Lezheng
Zhang, Yonglin
Xue, Li
Liu, Fengjuan
Chen, Qi
Luo, Jiesi
Jing, Runyu
collection PubMed
description DNA N(4)-methylcytosine (4mC) is a pivotal epigenetic modification that plays an essential role in DNA replication, repair, expression and differentiation. To gain insight into the biological functions of 4mC, it is critical to identify its modification sites in the genome. Deep learning has become increasingly popular in recent years and is frequently employed for 4mC site identification. However, a systematic analysis of how to build predictive models using deep learning techniques is still lacking. In this work, we first summarized all existing deep learning-based predictors and systematically analyzed their models, features, and datasets. Then, using a typical standard dataset with three species (A. thaliana, C. elegans, and D. melanogaster), we assessed the contribution of different model architectures, encoding methods, and the attention mechanism in establishing a deep learning-based model for 4mC site prediction. After a series of optimizations, a convolutional-recurrent neural network architecture using one-hot encoding and the attention mechanism achieved the best overall prediction performance. Extensive comparison experiments were conducted based on the same dataset. This work will be helpful for researchers who would like to build 4mC prediction models using deep learning in the future.
format Online
Article
Text
id pubmed-8989013
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8989013 2022-04-08 Systematic Analysis and Accurate Identification of DNA N4-Methylcytosine Sites by Deep Learning Yu, Lezheng Zhang, Yonglin Xue, Li Liu, Fengjuan Chen, Qi Luo, Jiesi Jing, Runyu Front Microbiol Microbiology DNA N(4)-methylcytosine (4mC) is a pivotal epigenetic modification that plays an essential role in DNA replication, repair, expression and differentiation. To gain insight into the biological functions of 4mC, it is critical to identify its modification sites in the genome. Deep learning has become increasingly popular in recent years and is frequently employed for 4mC site identification. However, a systematic analysis of how to build predictive models using deep learning techniques is still lacking. In this work, we first summarized all existing deep learning-based predictors and systematically analyzed their models, features, and datasets. Then, using a typical standard dataset with three species (A. thaliana, C. elegans, and D. melanogaster), we assessed the contribution of different model architectures, encoding methods, and the attention mechanism in establishing a deep learning-based model for 4mC site prediction. After a series of optimizations, a convolutional-recurrent neural network architecture using one-hot encoding and the attention mechanism achieved the best overall prediction performance. Extensive comparison experiments were conducted based on the same dataset. This work will be helpful for researchers who would like to build 4mC prediction models using deep learning in the future. Frontiers Media S.A. 2022-03-15 /pmc/articles/PMC8989013/ /pubmed/35401453 http://dx.doi.org/10.3389/fmicb.2022.843425 Text en Copyright © 2022 Yu, Zhang, Xue, Liu, Chen, Luo and Jing. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Systematic Analysis and Accurate Identification of DNA N4-Methylcytosine Sites by Deep Learning
topic Microbiology