
MBDDiff: an R package designed specifically for processing MBDcap-seq datasets

BACKGROUND: Since its initial discovery in 1975, DNA methylation has been intensively studied and shown to be involved in various biological processes, such as development, aging and tumor progression. Many experimental techniques have been developed to measure the level of DNA methylation. Methyl-C...


Bibliographic Details
Main Authors: Liu, Yuanhang; Wilson, Desiree; Leach, Robin J.; Chen, Yidong
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001203/
https://www.ncbi.nlm.nih.gov/pubmed/27556923
http://dx.doi.org/10.1186/s12864-016-2794-z
author Liu, Yuanhang
Wilson, Desiree
Leach, Robin J.
Chen, Yidong
collection PubMed
description BACKGROUND: Since its initial discovery in 1975, DNA methylation has been intensively studied and shown to be involved in various biological processes, such as development, aging and tumor progression. Many experimental techniques have been developed to measure the level of DNA methylation. Methyl-CpG binding domain-based capture followed by high-throughput sequencing (MBDCap-seq) is a widely used method for characterizing DNA methylation patterns in a genome-wide manner. However, current methods for processing MBDCap-seq datasets do not take into account the region-specific genomic characteristics that might affect the measurement of the amount of methylated DNA (signal) and background fluctuation (noise). Thus, specific software needs to be developed for MBDCap-seq experiments. RESULTS: A new differential methylation quantification algorithm for MBDCap-seq, MBDDiff, was implemented. To evaluate the performance of the MBDDiff algorithm, a set of simulated signals based on negative binomial and Poisson distributions, with parameters estimated from real MBDCap-seq datasets and accompanied by different levels of background noise, was generated. MBDDiff was then compared against a set of commonly used algorithms for MBDCap-seq data analysis in terms of area under the ROC curve (AUC), number of false discoveries and statistical power. In addition, we demonstrated the effectiveness of the MBDDiff algorithm by applying it to a set of in-house prostate cancer samples, endometrial cancer samples published earlier, and a set of public-domain triple-negative breast cancer samples to identify potential factors that contribute to cancer development and recurrence. CONCLUSIONS: In this paper we developed an algorithm, MBDDiff, designed specifically for datasets derived from MBDCap-seq. MBDDiff contains three modules: quality assessment of datasets and quantification of DNA methylation; determination of differential methylation of promoter regions; and visualization functionalities. Simulation results suggest that MBDDiff performs better than MEDIPS and DESeq in terms of AUC and the number of false discoveries at different levels of background noise. MBDDiff outperforms MEDIPS as background noise increases, and performs comparably when the noise level is lower. By applying MBDDiff to several MBDCap-seq datasets, we were able to identify potential targets that contribute to the corresponding biological processes. Taken together, MBDDiff provides users with an accurate differential methylation analysis for data generated by MBDCap-seq, especially under noisy conditions. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-2794-z) contains supplementary material, which is available to authorized users.
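The simulation-based evaluation described in the abstract can be illustrated with a small sketch. This is not MBDDiff's algorithm and does not reproduce the authors' simulation: the function names (`poisson`, `neg_binomial`, `simulate`, `auc`), all parameter values (mean of 20, threefold change, dispersion of 0.2, three replicates), and the log-ratio score are assumptions chosen only to show the general idea of drawing negative binomial signal, adding Poisson background noise, and scoring a detector by AUC.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) count with Knuth's multiplication method.

    Adequate for the small-to-moderate rates used in this sketch.
    """
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def neg_binomial(mean, dispersion, rng):
    """Draw a negative binomial count as a gamma-Poisson mixture.

    The resulting variance is mean + dispersion * mean**2.
    """
    shape = 1.0 / dispersion
    scale = mean * dispersion  # gamma scale chosen so the expected rate equals mean
    return poisson(rng.gammavariate(shape, scale), rng)


def simulate(n_regions=400, n_reps=3, frac_diff=0.2, noise=5.0, seed=0):
    """Simulate two-group counts per region and score each region.

    All numeric settings are hypothetical, not estimates from real data.
    Returns (labels, scores): labels mark truly differential regions, and
    scores are absolute log ratios of summed counts (a stand-in statistic).
    """
    rng = random.Random(seed)
    labels, scores = [], []
    for _ in range(n_regions):
        is_diff = rng.random() < frac_diff
        mean_a = 20.0
        mean_b = mean_a * (3.0 if is_diff else 1.0)
        # Observed count = negative binomial signal + Poisson background noise.
        group_a = [neg_binomial(mean_a, 0.2, rng) + poisson(noise, rng)
                   for _ in range(n_reps)]
        group_b = [neg_binomial(mean_b, 0.2, rng) + poisson(noise, rng)
                   for _ in range(n_reps)]
        score = abs(math.log((sum(group_b) + 1) / (sum(group_a) + 1)))
        labels.append(is_diff)
        scores.append(score)
    return labels, scores


def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic.

    Equals the probability that a truly differential region outscores a
    non-differential one, with ties counted as one half.
    """
    pos = [s for lab, s in zip(labels, scores) if lab]
    neg = [s for lab, s in zip(labels, scores) if not lab]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Report the AUC of the stand-in score at a few background noise levels.
    for noise_level in (1.0, 10.0, 50.0):
        labels, scores = simulate(noise=noise_level, seed=1)
        print(f"noise={noise_level:5.1f}  AUC={auc(labels, scores):.3f}")
```

A real comparison would replace the log-ratio score with the statistic or p-value reported by each tool under test (for example MBDDiff, MEDIPS, or DESeq) and would estimate the simulation parameters from actual MBDCap-seq data, as the abstract describes.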
format Online
Article
Text
id pubmed-5001203
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5001203 2016-09-06 MBDDiff: an R package designed specifically for processing MBDcap-seq datasets Liu, Yuanhang Wilson, Desiree Leach, Robin J. Chen, Yidong BMC Genomics Research BioMed Central 2016-08-18 /pmc/articles/PMC5001203/ /pubmed/27556923 http://dx.doi.org/10.1186/s12864-016-2794-z Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title MBDDiff: an R package designed specifically for processing MBDcap-seq datasets
topic Research