Prediction of promoters and enhancers using multiple DNA methylation-associated features

BACKGROUND: Regulatory regions (e.g. promoters and enhancers) play an essential role in human development and disease. Many computational approaches have been developed to predict the regulatory regions using various genomic features such as sequence motifs and evolutionary conservation. However, th...

Bibliographic Details
Main Authors: Hwang, Woochang, Oliver, Verity F, Merbs, Shannath L, Zhu, Heng, Qian, Jiang
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4474542/
https://www.ncbi.nlm.nih.gov/pubmed/26099324
http://dx.doi.org/10.1186/1471-2164-16-S7-S11
author Hwang, Woochang
Oliver, Verity F
Merbs, Shannath L
Zhu, Heng
Qian, Jiang
collection PubMed
description BACKGROUND: Regulatory regions (e.g. promoters and enhancers) play an essential role in human development and disease. Many computational approaches have been developed to predict regulatory regions using genomic features such as sequence motifs and evolutionary conservation. However, these DNA sequence-based approaches do not reflect the tissue-specific nature of regulatory regions. In this work, we propose to predict regulatory regions using multiple features derived from DNA methylation profiles. RESULTS: We discovered several interesting features of the methylated CpG (mCpG) sites within regulatory regions. First, the hypomethylated state of CpGs within regulatory regions, relative to the genomic background methylation level, extended more than 1000 bp from the center of the regulatory regions, demonstrating a high degree of correlation between the methylation statuses of neighboring mCpG sites. Second, when a regulatory region was inactive, as determined by histone mark differences between cell lines, the methylation level of its mCpG sites increased from a hypomethylated state to a hypermethylated state, reaching a level even higher than the genomic background. Third, a distinct set of sequence motifs was overrepresented around mCpG sites within regulatory regions. Using 5 types of features derived from DNA methylation profiles, we were able to predict promoters and enhancers with a machine-learning approach (a support vector machine). Prediction performance was high, with an area under the ROC curve (AUC) of 0.992 for promoters and 0.817 for enhancers, which is better than prediction based on methylation level alone, especially for enhancers. CONCLUSIONS: Our study suggests that DNA methylation features of mCpG sites can be used to predict regulatory regions.
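The abstract reports its results as an area under the ROC curve (AUC). As a rough illustration of what that number measures, the sketch below computes AUC as the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half. The function name `roc_auc`, the labels, and the scores are hypothetical toy values chosen for this example; they are not the authors' data, features, or code.

```python
def roc_auc(labels, scores):
    """Return the AUC: the probability that a randomly chosen positive
    example scores higher than a randomly chosen negative one
    (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = regulatory region, 0 = background region.
# The scores stand in for any classifier output (assumed, not from the paper).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.992 and 0.817 should be read.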
format Online
Article
Text
id pubmed-4474542
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4474542 2015-06-25 Prediction of promoters and enhancers using multiple DNA methylation-associated features Hwang, Woochang Oliver, Verity F Merbs, Shannath L Zhu, Heng Qian, Jiang BMC Genomics Research BioMed Central 2015-06-11 /pmc/articles/PMC4474542/ /pubmed/26099324 http://dx.doi.org/10.1186/1471-2164-16-S7-S11 Text en Copyright © 2015 Hwang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Prediction of promoters and enhancers using multiple DNA methylation-associated features
topic Research