Identification of DNA N(6)-methyladenine sites by integration of sequence features

BACKGROUND: An increasing number of nucleic acid modifications have been profiled with the development of sequencing technologies. DNA N(6)-methyladenine (6mA), which is a prevalent epigenetic modification, plays important roles in a series of biological processes. So far, identification of DNA 6mA...

Bibliographic Details
Main Authors: Wang, Hao-Tian; Xiao, Fu-Hui; Li, Gong-Hua; Kong, Qing-Peng
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7038560/
https://www.ncbi.nlm.nih.gov/pubmed/32093759
http://dx.doi.org/10.1186/s13072-020-00330-2
_version_ 1783500668575678464
author Wang, Hao-Tian
Xiao, Fu-Hui
Li, Gong-Hua
Kong, Qing-Peng
author_facet Wang, Hao-Tian
Xiao, Fu-Hui
Li, Gong-Hua
Kong, Qing-Peng
author_sort Wang, Hao-Tian
collection PubMed
description BACKGROUND: An increasing number of nucleic acid modifications have been profiled with the development of sequencing technologies. DNA N(6)-methyladenine (6mA), which is a prevalent epigenetic modification, plays important roles in a series of biological processes. So far, identification of DNA 6mA relies primarily on time-consuming and expensive experimental approaches. However, in silico methods can be implemented to conduct preliminary screening to save experimental resources and time, especially given the rapid accumulation of sequencing data. RESULTS: In this study, we constructed a 6mA predictor, p6mA, from a series of sequence-based features, including physicochemical properties, position-specific triple-nucleotide propensity (PSTNP), and electron–ion interaction pseudopotential (EIIP). We performed maximum relevance maximum distance (MRMD) analysis to select key features and used the Extreme Gradient Boosting (XGBoost) algorithm to build our predictor. Results demonstrated that p6mA outperformed other existing predictors using different datasets. CONCLUSIONS: p6mA can predict the methylation status of DNA adenines, using only sequence files. It may be used as a tool to help the study of 6mA distribution pattern. Users can download it from https://github.com/Konglab404/p6mA.
format Online
Article
Text
id pubmed-7038560
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-70385602020-03-02 Identification of DNA N(6)-methyladenine sites by integration of sequence features Wang, Hao-Tian Xiao, Fu-Hui Li, Gong-Hua Kong, Qing-Peng Epigenetics Chromatin Methodology BACKGROUND: An increasing number of nucleic acid modifications have been profiled with the development of sequencing technologies. DNA N(6)-methyladenine (6mA), which is a prevalent epigenetic modification, plays important roles in a series of biological processes. So far, identification of DNA 6mA relies primarily on time-consuming and expensive experimental approaches. However, in silico methods can be implemented to conduct preliminary screening to save experimental resources and time, especially given the rapid accumulation of sequencing data. RESULTS: In this study, we constructed a 6mA predictor, p6mA, from a series of sequence-based features, including physicochemical properties, position-specific triple-nucleotide propensity (PSTNP), and electron–ion interaction pseudopotential (EIIP). We performed maximum relevance maximum distance (MRMD) analysis to select key features and used the Extreme Gradient Boosting (XGBoost) algorithm to build our predictor. Results demonstrated that p6mA outperformed other existing predictors using different datasets. CONCLUSIONS: p6mA can predict the methylation status of DNA adenines, using only sequence files. It may be used as a tool to help the study of 6mA distribution pattern. Users can download it from https://github.com/Konglab404/p6mA. 
BioMed Central 2020-02-24 /pmc/articles/PMC7038560/ /pubmed/32093759 http://dx.doi.org/10.1186/s13072-020-00330-2 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Identification of DNA N(6)-methyladenine sites by integration of sequence features
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7038560/
https://www.ncbi.nlm.nih.gov/pubmed/32093759
http://dx.doi.org/10.1186/s13072-020-00330-2