
DCNN-4mC: Densely connected neural network based N4-methylcytosine site prediction in multiple species

DNA N4-methylcytosine (4mC) is a significant genetic modification that plays a major role in controlling biological functions, including DNA replication, DNA repair, gene regulation, and gene expression levels. Identifying 4mC sites is important for gaining insight regarding...


Bibliographic Details
Main Authors: Rehman, Mobeen Ur, Tayara, Hilal, Chong, Kil To
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8605313/
https://www.ncbi.nlm.nih.gov/pubmed/34849205
http://dx.doi.org/10.1016/j.csbj.2021.10.034
author Rehman, Mobeen Ur
Tayara, Hilal
Chong, Kil To
collection PubMed
description DNA N4-methylcytosine (4mC) is a significant genetic modification that plays a major role in controlling biological functions, including DNA replication, DNA repair, gene regulation, and gene expression levels. Identifying 4mC sites is important for gaining insight into different organic mechanisms. However, detecting the modification with experimental methods is challenging because the techniques are expensive and time-consuming. Computational tools are therefore an attractive option for identifying modifications. Various computational tools have been proposed in the literature, but their generalization and prediction performance require improvement. For this reason, we propose a neural network based tool named DCNN-4mC for identifying 4mC sites. The proposed model consists of a set of neural network layers with a skip connection that shares the shallow features with the dense layers. The skip connection allows the model to gather crucial information regarding 4mC sites. In the literature, different models are applied to different species, so in many cases several datasets are available for a single species. In this research, we combined all available datasets to create a single benchmark dataset for every species. To the best of our knowledge, no model in the literature has been applied to more than six different species. To ensure the generalizability of DCNN-4mC, we used 12 different species for performance evaluation. The DCNN-4mC tool attained 2% to 14% higher accuracy than state-of-the-art tools on all available datasets of different species. Furthermore, independent test datasets were also used, and DCNN-4mC yielded high performance overall on them as well.
format Online
Article
Text
id pubmed-8605313
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-8605313 2021-11-29 DCNN-4mC: Densely connected neural network based N4-methylcytosine site prediction in multiple species Rehman, Mobeen Ur Tayara, Hilal Chong, Kil To Comput Struct Biotechnol J Research Article
Research Network of Computational and Structural Biotechnology 2021-11-01 /pmc/articles/PMC8605313/ /pubmed/34849205 http://dx.doi.org/10.1016/j.csbj.2021.10.034 Text en © 2021 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title DCNN-4mC: Densely connected neural network based N4-methylcytosine site prediction in multiple species
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8605313/
https://www.ncbi.nlm.nih.gov/pubmed/34849205
http://dx.doi.org/10.1016/j.csbj.2021.10.034
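
The description in this record mentions a skip connection that shares shallow features with the dense layers. The sketch below illustrates that general idea only and is not the authors' architecture. The one-hot input encoding, the layer sizes, the use of concatenation for the skip connection, the untrained random weights, the example sequence, and all function names are assumptions made for illustration.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def sigmoid(value):
    """Squash a single logit into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def random_layer(rng, n_in, n_out):
    """Hypothetical untrained parameters, for illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def one_hot(sequence):
    """Assumed input encoding: four indicator values per nucleotide."""
    table = {
        "A": [1.0, 0.0, 0.0, 0.0],
        "C": [0.0, 1.0, 0.0, 0.0],
        "G": [0.0, 0.0, 1.0, 0.0],
        "T": [0.0, 0.0, 0.0, 1.0],
    }
    encoded = []
    for base in sequence:
        encoded.extend(table[base])
    return encoded


def forward_with_skip(features, shallow_layer, deep_layer, output_layer):
    """Forward pass in which shallow features bypass the deeper layer."""
    shallow = relu(dense(features, *shallow_layer))
    deep = relu(dense(shallow, *deep_layer))
    # Skip connection (assumed to be a concatenation): the output layer
    # receives the shallow features alongside the deeper ones.
    merged = shallow + deep
    logit = dense(merged, *output_layer)[0]
    return sigmoid(logit), merged


rng = random.Random(0)
sequence = "ACGTCGATCGA"  # made-up example sequence, not taken from any dataset
features = one_hot(sequence)
shallow_layer = random_layer(rng, len(features), 8)
deep_layer = random_layer(rng, 8, 4)
output_layer = random_layer(rng, 8 + 4, 1)

probability, merged = forward_with_skip(
    features, shallow_layer, deep_layer, output_layer
)
print(f"untrained score for the example sequence: {probability:.3f}")
```

Because the weights are random, the printed score carries no biological meaning. The point of the sketch is only that the final layer sees the first eight (shallow) values of `merged` unchanged, next to the four deeper ones.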