MBDDiff: an R package designed specifically for processing MBDcap-seq datasets
BACKGROUND: Since its initial discovery in 1975, DNA methylation has been intensively studied and shown to be involved in various biological processes, such as development, aging and tumor progression. Many experimental techniques have been developed to measure the level of DNA methylation. Methyl-C...
Main authors: | Liu, Yuanhang; Wilson, Desiree; Leach, Robin J.; Chen, Yidong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001203/ https://www.ncbi.nlm.nih.gov/pubmed/27556923 http://dx.doi.org/10.1186/s12864-016-2794-z |
_version_ | 1782450429421944832 |
---|---|
author | Liu, Yuanhang Wilson, Desiree Leach, Robin J. Chen, Yidong |
author_facet | Liu, Yuanhang Wilson, Desiree Leach, Robin J. Chen, Yidong |
author_sort | Liu, Yuanhang |
collection | PubMed |
description | BACKGROUND: Since its initial discovery in 1975, DNA methylation has been intensively studied and shown to be involved in various biological processes, such as development, aging and tumor progression. Many experimental techniques have been developed to measure the level of DNA methylation. Methyl-CpG binding domain-based capture followed by high-throughput sequencing (MBDCap-seq) is a widely used method for characterizing DNA methylation patterns in a genome-wide manner. However, current methods for processing MBDCap-seq datasets do not take into account the region-specific genomic characteristics that might have an impact on the measurements of the amount of methylated DNA (signal) and background fluctuation (noise). Thus, specific software needs to be developed for MBDCap-seq experiments. RESULTS: A new differential methylation quantification algorithm for MBDCap-seq, MBDDiff, was implemented. To evaluate the performance of the MBDDiff algorithm, a set of simulated signals based on negative binomial and Poisson distributions, with parameters estimated from real MBDCap-seq datasets and accompanied by different levels of background noise, was generated, and MBDDiff was then compared against a set of commonly used algorithms for MBDCap-seq data analysis in terms of area under the ROC curve (AUC), number of false discoveries and statistical power. In addition, we demonstrated the effectiveness of the MBDDiff algorithm by applying it to a set of in-house prostate cancer samples, previously published endometrial cancer samples, and a set of public-domain triple negative breast cancer samples to identify potential factors that contribute to cancer development and recurrence. CONCLUSIONS: In this paper we developed an algorithm, MBDDiff, designed specifically for datasets derived from MBDCap-seq. MBDDiff contains three modules: quality assessment of datasets and quantification of DNA methylation; determination of differential methylation of promoter regions; and visualization functionalities. 
Simulation results suggest that MBDDiff performs better than MEDIPS and DESeq in terms of AUC and the number of false discoveries at different levels of background noise. MBDDiff outperforms MEDIPS as background noise increases and shows comparable performance when the noise level is lower. By applying MBDDiff to several MBDCap-seq datasets, we were able to identify potential targets that contribute to the corresponding biological processes. Taken together, MBDDiff provides users with an accurate differential methylation analysis for data generated by MBDCap-seq, especially under noisy conditions. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-2794-z) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5001203 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5001203 2016-09-06 MBDDiff: an R package designed specifically for processing MBDcap-seq datasets Liu, Yuanhang Wilson, Desiree Leach, Robin J. Chen, Yidong BMC Genomics Research BACKGROUND: Since its initial discovery in 1975, DNA methylation has been intensively studied and shown to be involved in various biological processes, such as development, aging and tumor progression. Many experimental techniques have been developed to measure the level of DNA methylation. Methyl-CpG binding domain-based capture followed by high-throughput sequencing (MBDCap-seq) is a widely used method for characterizing DNA methylation patterns in a genome-wide manner. However, current methods for processing MBDCap-seq datasets do not take into account the region-specific genomic characteristics that might have an impact on the measurements of the amount of methylated DNA (signal) and background fluctuation (noise). Thus, specific software needs to be developed for MBDCap-seq experiments. RESULTS: A new differential methylation quantification algorithm for MBDCap-seq, MBDDiff, was implemented. To evaluate the performance of the MBDDiff algorithm, a set of simulated signals based on negative binomial and Poisson distributions, with parameters estimated from real MBDCap-seq datasets and accompanied by different levels of background noise, was generated, and MBDDiff was then compared against a set of commonly used algorithms for MBDCap-seq data analysis in terms of area under the ROC curve (AUC), number of false discoveries and statistical power. In addition, we demonstrated the effectiveness of the MBDDiff algorithm by applying it to a set of in-house prostate cancer samples, previously published endometrial cancer samples, and a set of public-domain triple negative breast cancer samples to identify potential factors that contribute to cancer development and recurrence. CONCLUSIONS: In this paper we developed an algorithm, MBDDiff, designed specifically for datasets derived from MBDCap-seq. 
MBDDiff contains three modules: quality assessment of datasets and quantification of DNA methylation; determination of differential methylation of promoter regions; and visualization functionalities. Simulation results suggest that MBDDiff performs better than MEDIPS and DESeq in terms of AUC and the number of false discoveries at different levels of background noise. MBDDiff outperforms MEDIPS as background noise increases and shows comparable performance when the noise level is lower. By applying MBDDiff to several MBDCap-seq datasets, we were able to identify potential targets that contribute to the corresponding biological processes. Taken together, MBDDiff provides users with an accurate differential methylation analysis for data generated by MBDCap-seq, especially under noisy conditions. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-2794-z) contains supplementary material, which is available to authorized users. BioMed Central 2016-08-18 /pmc/articles/PMC5001203/ /pubmed/27556923 http://dx.doi.org/10.1186/s12864-016-2794-z Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Liu, Yuanhang Wilson, Desiree Leach, Robin J. Chen, Yidong MBDDiff: an R package designed specifically for processing MBDcap-seq datasets |
title | MBDDiff: an R package designed specifically for processing MBDcap-seq datasets |
title_full | MBDDiff: an R package designed specifically for processing MBDcap-seq datasets |
title_fullStr | MBDDiff: an R package designed specifically for processing MBDcap-seq datasets |
title_full_unstemmed | MBDDiff: an R package designed specifically for processing MBDcap-seq datasets |
title_short | MBDDiff: an R package designed specifically for processing MBDcap-seq datasets |
title_sort | mbddiff: an r package designed specifically for processing mbdcap-seq datasets |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001203/ https://www.ncbi.nlm.nih.gov/pubmed/27556923 http://dx.doi.org/10.1186/s12864-016-2794-z |
work_keys_str_mv | AT liuyuanhang mbddiffanrpackagedesignedspecificallyforprocessingmbdcapseqdatasets AT wilsondesiree mbddiffanrpackagedesignedspecificallyforprocessingmbdcapseqdatasets AT leachrobinj mbddiffanrpackagedesignedspecificallyforprocessingmbdcapseqdatasets AT chenyidong mbddiffanrpackagedesignedspecificallyforprocessingmbdcapseqdatasets |
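
The evaluation setup described in the abstract (negative binomial signal, Poisson background noise, comparison by AUC) can be illustrated with a small sketch. This is not the MBDDiff algorithm and does not use the MBDDiff, MEDIPS or DESeq packages. The function names, the parameter values (`base`, `fold`, `size`, the noise levels) and the naive log-ratio score are all assumptions chosen for illustration; the paper estimated its parameters from real MBDCap-seq data.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) sample with Knuth's method (adequate for modest lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def negative_binomial(rng, mean, size):
    """Draw a negative binomial count as a gamma-Poisson mixture.

    `size` is the dispersion parameter; smaller values give more overdispersion.
    """
    lam = rng.gammavariate(size, mean / size)  # shape=size, scale=mean/size
    return poisson(rng, lam)


def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate_auc(noise, n_regions=400, replicates=3, base=10.0,
                 fold=3.0, size=5.0, seed=0):
    """Simulate two groups of regions and score them with a naive log ratio.

    All defaults are hypothetical values, not estimates from the paper.
    """
    rng = random.Random(seed)
    scores, labels = [], []
    for i in range(n_regions):
        differential = i % 2 == 0  # half of the regions are truly differential
        treated_mean = base * fold if differential else base
        # Observed count = NB signal + Poisson background noise
        control = [negative_binomial(rng, base, size) + poisson(rng, noise)
                   for _ in range(replicates)]
        treated = [negative_binomial(rng, treated_mean, size) + poisson(rng, noise)
                   for _ in range(replicates)]
        # Naive score (an assumption, not MBDDiff's statistic): absolute log ratio
        scores.append(abs(math.log((sum(treated) + 1) / (sum(control) + 1))))
        labels.append(differential)
    return auc(scores, labels)


if __name__ == "__main__":
    for noise in (0.0, 10.0, 50.0):
        print(f"background noise {noise:5.1f}: AUC = {simulate_auc(noise):.3f}")
```

Running the script prints one AUC per noise level, which makes it easy to see how a score that ignores background noise behaves as that noise grows. A method that models the background explicitly, as the abstract says MBDDiff does, would replace the naive score in `simulate_auc`.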