Prediction of promoters and enhancers using multiple DNA methylation-associated features
BACKGROUND: Regulatory regions (e.g. promoters and enhancers) play an essential role in human development and disease. Many computational approaches have been developed to predict the regulatory regions using various genomic features such as sequence motifs and evolutionary conservation. However, th...
Main authors: | Hwang, Woochang; Oliver, Verity F; Merbs, Shannath L; Zhu, Heng; Qian, Jiang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4474542/ https://www.ncbi.nlm.nih.gov/pubmed/26099324 http://dx.doi.org/10.1186/1471-2164-16-S7-S11 |
_version_ | 1782377286704562176 |
---|---|
author | Hwang, Woochang Oliver, Verity F Merbs, Shannath L Zhu, Heng Qian, Jiang |
author_facet | Hwang, Woochang Oliver, Verity F Merbs, Shannath L Zhu, Heng Qian, Jiang |
author_sort | Hwang, Woochang |
collection | PubMed |
description | BACKGROUND: Regulatory regions (e.g. promoters and enhancers) play an essential role in human development and disease. Many computational approaches have been developed to predict the regulatory regions using various genomic features such as sequence motifs and evolutionary conservation. However, these DNA sequence-based approaches do not reflect the tissue-specific nature of the regulatory regions. In this work, we propose to predict regulatory regions using multiple features derived from DNA methylation profiles. RESULTS: We discovered several interesting features of the methylated CpG (mCpG) sites within regulatory regions. First, a hypomethylation status of CpGs within regulatory regions, compared to the genomic background methylation level, extended out >1000 bp from the center of the regulatory regions, demonstrating a high degree of correlation between the methylation statuses of neighboring mCpG sites. Second, when a regulatory region was inactive, as determined by histone mark differences between cell lines, the methylation level of the mCpG site increased from a hypomethylated state to a hypermethylated state, the level of which was even higher than the genomic background. Third, a distinct set of sequence motifs was overrepresented surrounding mCpG sites within regulatory regions. Using 5 types of features derived from DNA methylation profiles, we were able to predict promoters and enhancers using a machine-learning approach (support vector machine). Prediction performance for promoters and enhancers was high, with an area under the ROC curve (AUC) of 0.992 and 0.817, respectively, which is better than prediction based on methylation level alone, especially for enhancers. CONCLUSIONS: Our study suggests that DNA methylation features of mCpG sites can be used to predict regulatory regions. |
format | Online Article Text |
id | pubmed-4474542 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-44745422015-06-25 Prediction of promoters and enhancers using multiple DNA methylation-associated features Hwang, Woochang Oliver, Verity F Merbs, Shannath L Zhu, Heng Qian, Jiang BMC Genomics Research BACKGROUND: Regulatory regions (e.g. promoters and enhancers) play an essential role in human development and disease. Many computational approaches have been developed to predict the regulatory regions using various genomic features such as sequence motifs and evolutionary conservation. However, these DNA sequence-based approaches do not reflect the tissue-specific nature of the regulatory regions. In this work, we propose to predict regulatory regions using multiple features derived from DNA methylation profile. RESULTS: We discovered several interesting features of the methylated CpG (mCpG) sites within regulatory regions. First, a hypomethylation status of CpGs within regulatory regions, compared to the genomic background methylation level, extended out >1000 bp from the center of the regulatory regions, demonstrating a high degree of correlation between the methylation statuses of neighboring mCpG sites. Second, when a regulatory region was inactive, as determined by histone mark differences between cell lines, methylation level of the mCpG site increased from a hypomethylated state to a hypermethylated state, the level of which was even higher than the genomic background. Third, a distinct set of sequence motifs was overrepresented surrounding mCpG sites within regulatory regions. Using 5 types of features derived from DNA methylation profiles, we were able to predict promoters and enhancers using machine-learning approach (support vector machine). The performances for prediction of promoters and enhancers are quite well, showing an area under the ROC curve (AUC) of 0.992 and 0.817, respectively, which is better than that simply based on methylation level, especially for prediction of enhancers. 
CONCLUSIONS: Our study suggests that DNA methylation features of mCpG sites can be used to predict regulatory regions. BioMed Central 2015-06-11 /pmc/articles/PMC4474542/ /pubmed/26099324 http://dx.doi.org/10.1186/1471-2164-16-S7-S11 Text en Copyright © 2015 Hwang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Hwang, Woochang Oliver, Verity F Merbs, Shannath L Zhu, Heng Qian, Jiang Prediction of promoters and enhancers using multiple DNA methylation-associated features |
title | Prediction of promoters and enhancers using multiple DNA methylation-associated features |
title_full | Prediction of promoters and enhancers using multiple DNA methylation-associated features |
title_fullStr | Prediction of promoters and enhancers using multiple DNA methylation-associated features |
title_full_unstemmed | Prediction of promoters and enhancers using multiple DNA methylation-associated features |
title_short | Prediction of promoters and enhancers using multiple DNA methylation-associated features |
title_sort | prediction of promoters and enhancers using multiple dna methylation-associated features |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4474542/ https://www.ncbi.nlm.nih.gov/pubmed/26099324 http://dx.doi.org/10.1186/1471-2164-16-S7-S11 |
work_keys_str_mv | AT hwangwoochang predictionofpromotersandenhancersusingmultiplednamethylationassociatedfeatures AT oliververityf predictionofpromotersandenhancersusingmultiplednamethylationassociatedfeatures AT merbsshannathl predictionofpromotersandenhancersusingmultiplednamethylationassociatedfeatures AT zhuheng predictionofpromotersandenhancersusingmultiplednamethylationassociatedfeatures AT qianjiang predictionofpromotersandenhancersusingmultiplednamethylationassociatedfeatures |
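
Note on the AUC figures in the description: the record reports an area under the ROC curve (AUC) of 0.992 for promoters and 0.817 for enhancers. As a reading aid only, the sketch below shows one common way such a value can be computed from classifier scores, using the rank-based (Mann-Whitney) identity: AUC is the fraction of positive/negative pairs in which the positive example receives the higher score. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations; they are not the authors' code, features, or data, and the record does not say how the authors computed their AUC.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 (1 = regulatory region, 0 = background).
    scores: classifier outputs, higher meaning "more likely positive".
    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy data, NOT taken from the article.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; for genome-scale evaluation a sort-based implementation from an established library would be the more practical choice.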