Empirical Comparison and Analysis of Web-Based DNA N(4)-Methylcytosine Site Prediction Tools

Bibliographic Details
Main authors: Manavalan, Balachandran; Hasan, Md. Mehedi; Basith, Shaherin; Gosu, Vijayakumar; Shin, Tae-Hwan; Lee, Gwang
Format: Online Article Text
Language: English
Published: American Society of Gene & Cell Therapy 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7533314/
https://www.ncbi.nlm.nih.gov/pubmed/33230445
http://dx.doi.org/10.1016/j.omtn.2020.09.010
author Manavalan, Balachandran
Hasan, Md. Mehedi
Basith, Shaherin
Gosu, Vijayakumar
Shin, Tae-Hwan
Lee, Gwang
author_sort Manavalan, Balachandran
collection PubMed
description DNA N(4)-methylcytosine (4mC) is a crucial epigenetic modification involved in various biological processes. Accurate genome-wide identification of these sites is critical for improving our understanding of their biological functions and mechanisms. As experimental methods for 4mC identification are tedious, expensive, and labor-intensive, several machine learning-based approaches have been developed for genome-wide detection of such sites in multiple species. However, the predictions projected by these tools are difficult to quantify and compare. To date, no systematic performance comparison of 4mC tools has been reported. The aim of this study was to compare and critically evaluate 12 publicly available 4mC site prediction tools according to species specificity, based on a huge independent validation dataset. The tools 4mCCNN (Escherichia coli), DNA4mC-LIP (Arabidopsis thaliana), iDNA-MS (Fragaria vesca), DNA4mC-LIP and 4mCCNN (Drosophila melanogaster), and four tools for Caenorhabditis elegans achieved excellent overall performance compared with their counterparts. However, none of the existing methods was suitable for Geoalkalibacter subterraneus, Geobacter pickeringii, and Mus musculus, thereby limiting their practical applicability. Model transferability to five species and non-transferability to three species are also discussed. The presented evaluation will assist researchers in selecting appropriate prediction tools that best suit their purpose and provide useful guidelines for the development of improved 4mC predictors in the future.
format Online
Article
Text
id pubmed-7533314
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher American Society of Gene & Cell Therapy
record_format MEDLINE/PubMed
spelling pubmed-7533314 2020-10-16 Mol Ther Nucleic Acids Original Article
American Society of Gene & Cell Therapy 2020-09-16 /pmc/articles/PMC7533314/ /pubmed/33230445 http://dx.doi.org/10.1016/j.omtn.2020.09.010 Text en © 2020 The Author(s) http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Empirical Comparison and Analysis of Web-Based DNA N(4)-Methylcytosine Site Prediction Tools
topic Original Article