i4mC-EL: Identifying DNA N4-Methylcytosine Sites in the Mouse Genome Using Ensemble Learning
As one of the important epigenetic modifications, DNA N4-methylcytosine (4mC) plays a crucial role in controlling gene replication, expression, cell cycle, DNA replication, and differentiation. The accurate identification of 4mC sites is necessary to understand biological functions. In this paper, we use...
Main authors: | Li, Yanjuan; Zhao, Zhengnan; Teng, Zhixia |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8187051/ https://www.ncbi.nlm.nih.gov/pubmed/34159192 http://dx.doi.org/10.1155/2021/5515342 |
_version_ | 1783705066282156032 |
---|---|
author | Li, Yanjuan Zhao, Zhengnan Teng, Zhixia |
author_facet | Li, Yanjuan Zhao, Zhengnan Teng, Zhixia |
author_sort | Li, Yanjuan |
collection | PubMed |
description | As one of the important epigenetic modifications, DNA N4-methylcytosine (4mC) plays a crucial role in controlling gene replication, expression, cell cycle, DNA replication, and differentiation. The accurate identification of 4mC sites is necessary to understand biological functions. In this paper, we use ensemble learning to develop a model named i4mC-EL to identify 4mC sites in the mouse genome. First, a multifeature encoding scheme consisting of Kmer and EIIP was adopted to describe the DNA sequences. Second, on the basis of this encoding scheme, we developed a stacked ensemble model in which four machine learning algorithms (BayesNet, NaiveBayes, LibSVM, and Voted Perceptron) serve as base classifiers whose intermediate results are the input to the metaclassifier, Logistic. The experimental results on the independent test dataset demonstrate that the overall predictive accuracy of i4mC-EL is 82.19%, which is higher than that of existing methods. The user-friendly website implementing i4mC-EL can be accessed freely at the following. |
format | Online Article Text |
id | pubmed-8187051 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-8187051 2021-06-21 i4mC-EL: Identifying DNA N4-Methylcytosine Sites in the Mouse Genome Using Ensemble Learning Li, Yanjuan Zhao, Zhengnan Teng, Zhixia Biomed Res Int Research Article As one of the important epigenetic modifications, DNA N4-methylcytosine (4mC) plays a crucial role in controlling gene replication, expression, cell cycle, DNA replication, and differentiation. The accurate identification of 4mC sites is necessary to understand biological functions. In this paper, we use ensemble learning to develop a model named i4mC-EL to identify 4mC sites in the mouse genome. First, a multifeature encoding scheme consisting of Kmer and EIIP was adopted to describe the DNA sequences. Second, on the basis of this encoding scheme, we developed a stacked ensemble model in which four machine learning algorithms (BayesNet, NaiveBayes, LibSVM, and Voted Perceptron) serve as base classifiers whose intermediate results are the input to the metaclassifier, Logistic. The experimental results on the independent test dataset demonstrate that the overall predictive accuracy of i4mC-EL is 82.19%, which is higher than that of existing methods. The user-friendly website implementing i4mC-EL can be accessed freely at the following. Hindawi 2021-05-29 /pmc/articles/PMC8187051/ /pubmed/34159192 http://dx.doi.org/10.1155/2021/5515342 Text en Copyright © 2021 Yanjuan Li et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Li, Yanjuan Zhao, Zhengnan Teng, Zhixia i4mC-EL: Identifying DNA N4-Methylcytosine Sites in the Mouse Genome Using Ensemble Learning |
title | i4mC-EL: Identifying DNA N4-Methylcytosine Sites in the Mouse Genome Using Ensemble Learning |
title_sort | i4mc-el: identifying dna n4-methylcytosine sites in the mouse genome using ensemble learning |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8187051/ https://www.ncbi.nlm.nih.gov/pubmed/34159192 http://dx.doi.org/10.1155/2021/5515342 |
work_keys_str_mv | AT liyanjuan i4mcelidentifyingdnan4methylcytosinesitesinthemousegenomeusingensemblelearning AT zhaozhengnan i4mcelidentifyingdnan4methylcytosinesitesinthemousegenomeusingensemblelearning AT tengzhixia i4mcelidentifyingdnan4methylcytosinesitesinthemousegenomeusingensemblelearning |
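
The description above mentions a multifeature encoding built from Kmer and EIIP. The following is a minimal sketch of what such an encoding can look like, not the authors' implementation: the function names, the choice of k = 2, and the way the two feature sets are concatenated are assumptions made for illustration. The EIIP values are the commonly used electron-ion interaction pseudopotential constants for the four nucleotides.

```python
from itertools import product

# Commonly cited EIIP constants for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def kmer_frequencies(seq, k=2):
    """Return normalized frequencies of all 4**k k-mers in lexicographic order.

    k=2 is an illustrative default; the record does not state the value used.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(seq) - k + 1  # number of sliding windows
    for i in range(total):
        window = seq[i:i + k]
        if window in counts:  # ignore windows with ambiguous symbols such as N
            counts[window] += 1
    return [counts[m] / total if total > 0 else 0.0 for m in kmers]


def eiip_encoding(seq):
    """Map each nucleotide to its EIIP value (0.0 for unknown symbols)."""
    return [EIIP.get(base, 0.0) for base in seq.upper()]


def encode(seq, k=2):
    """Concatenate k-mer frequencies and per-base EIIP values (hypothetical layout)."""
    return kmer_frequencies(seq, k) + eiip_encoding(seq)


if __name__ == "__main__":
    features = encode("ACGTAC")
    print(len(features), features[:4])
```

For fixed-length input sequences, a vector like this could be passed to base classifiers whose outputs feed a logistic metaclassifier, as the description outlines; that stacking step is omitted here.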