DNA sequences performs as natural language processing by exploiting deep learning algorithm for the identification of N4-methylcytosine

N4-methylcytosine is a biochemical alteration of DNA that affects genetic operations such as gene expression, genomic imprinting, chromosome stability, and cell development without modifying the DNA nucleotides. In the proposed work, a computational model, 4mCNLP-Deep, used the word e...
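
The full description in this record mentions a 3-mer corpus fed to word2vec and reports accuracy, Matthews correlation coefficient, sensitivity, and specificity. As a minimal sketch only (not the authors' code), the following Python shows how a DNA string might be split into k-mer "words" and how those four metrics follow from confusion-matrix counts. The function names `kmer_tokens` and `binary_metrics`, the default stride of 1, and the example counts are assumptions made for illustration and do not come from the record.

```python
import math


def kmer_tokens(sequence, k=3, stride=1):
    """Split a DNA string into k-mer 'words'.

    Assumption: a stride of 1 gives overlapping k-mers; the record does not
    say which stride the authors used.
    """
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, tn, fp, fn):
    """Compute ACC, Sn, Sp, and MCC from confusion-matrix counts."""
    acc = _ratio(tp + tn, tp + tn + fp + fn)
    sn = _ratio(tp, tp + fn)   # sensitivity: true-positive rate
    sp = _ratio(tn, tn + fp)   # specificity: true-negative rate
    mcc = _ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {"ACC": acc, "Sn": sn, "Sp": sp, "MCC": mcc}


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the functions.
    print(kmer_tokens("ACGTAC"))
    print(binary_metrics(tp=90, tn=80, fp=20, fn=10))
```

The resulting token lists could then be passed to a word-embedding trainer; that step is left out here because the record gives no training parameters.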

Bibliographic Details
Main authors:	Wahab, Abdul; Tayara, Hilal; Xuan, Zhenyu; Chong, Kil To
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7794489/ https://www.ncbi.nlm.nih.gov/pubmed/33420191 http://dx.doi.org/10.1038/s41598-020-80430-x

_version_	1783634220899368960
author	Wahab, Abdul; Tayara, Hilal; Xuan, Zhenyu; Chong, Kil To
author_facet	Wahab, Abdul; Tayara, Hilal; Xuan, Zhenyu; Chong, Kil To
author_sort	Wahab, Abdul
collection	PubMed
description	N4-methylcytosine (4mC) is a biochemical alteration of DNA that affects genetic operations such as gene expression, genomic imprinting, chromosome stability, and cell development without modifying the DNA nucleotides. In the proposed work, a computational model, 4mCNLP-Deep, uses a word embedding approach as the vector formulation and a deep learning based CNN algorithm to predict 4mC and non-4mC sites on the C. elegans genome dataset. A range of settings, such as the k-mer size of the corpus and the number of folds in k-fold cross-validation, was explored in the experiments to obtain the best performance. 4mCNLP-Deep outperforms the state-of-the-art predictor on five evaluation metrics: accuracy (ACC) of 0.9354, Matthews correlation coefficient (MCC) of 0.8608, specificity (Sp) of 0.8996, sensitivity (Sn) of 0.9563, and area under the curve (AUC) of 0.9731, obtained using a 3-mer corpus with word2vec and 3-fold cross-validation, with increments of 1.1%, 0.6%, 0.58%, 0.77%, and 4.89%, respectively. Finally, we developed an online webserver, http://nsclbio.jbnu.ac.kr/tools/4mCNLP-Deep/, so that experimental researchers can obtain results easily.
format	Online Article Text
id	pubmed-7794489
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-7794489 2021-01-12 DNA sequences performs as natural language processing by exploiting deep learning algorithm for the identification of N4-methylcytosine Wahab, Abdul; Tayara, Hilal; Xuan, Zhenyu; Chong, Kil To Sci Rep Article N4-methylcytosine (4mC) is a biochemical alteration of DNA that affects genetic operations such as gene expression, genomic imprinting, chromosome stability, and cell development without modifying the DNA nucleotides. In the proposed work, a computational model, 4mCNLP-Deep, uses a word embedding approach as the vector formulation and a deep learning based CNN algorithm to predict 4mC and non-4mC sites on the C. elegans genome dataset. A range of settings, such as the k-mer size of the corpus and the number of folds in k-fold cross-validation, was explored in the experiments to obtain the best performance. 4mCNLP-Deep outperforms the state-of-the-art predictor on five evaluation metrics: accuracy (ACC) of 0.9354, Matthews correlation coefficient (MCC) of 0.8608, specificity (Sp) of 0.8996, sensitivity (Sn) of 0.9563, and area under the curve (AUC) of 0.9731, obtained using a 3-mer corpus with word2vec and 3-fold cross-validation, with increments of 1.1%, 0.6%, 0.58%, 0.77%, and 4.89%, respectively. Finally, we developed an online webserver, http://nsclbio.jbnu.ac.kr/tools/4mCNLP-Deep/, so that experimental researchers can obtain results easily. Nature Publishing Group UK 2021-01-08 /pmc/articles/PMC7794489/ /pubmed/33420191 http://dx.doi.org/10.1038/s41598-020-80430-x Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title	DNA sequences performs as natural language processing by exploiting deep learning algorithm for the identification of N4-methylcytosine
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7794489/ https://www.ncbi.nlm.nih.gov/pubmed/33420191 http://dx.doi.org/10.1038/s41598-020-80430-x