
DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data

BACKGROUND: DNA methylation is an epigenetic modification that is studied at a single-base resolution with bisulfite treatment followed by high-throughput sequencing. After alignment of the sequence reads to a reference genome, methylation counts are analyzed to determine genomic regions that are di...


Bibliographic Details
Main Authors: Gaspar, John M., Hart, Ronald P.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5817627/
https://www.ncbi.nlm.nih.gov/pubmed/29187143
http://dx.doi.org/10.1186/s12859-017-1909-0
_version_ 1783300898081996800
author Gaspar, John M.
Hart, Ronald P.
author_facet Gaspar, John M.
Hart, Ronald P.
author_sort Gaspar, John M.
collection PubMed
description BACKGROUND: DNA methylation is an epigenetic modification that is studied at a single-base resolution with bisulfite treatment followed by high-throughput sequencing. After alignment of the sequence reads to a reference genome, methylation counts are analyzed to determine genomic regions that are differentially methylated between two or more biological conditions. Even though a variety of software packages is available for different aspects of the bioinformatics analysis, they often produce results that are biased or have excessive computational requirements.
RESULTS: DMRfinder is a novel computational pipeline that identifies differentially methylated regions efficiently. Following alignment, DMRfinder extracts methylation counts and performs a modified single-linkage clustering of methylation sites into genomic regions. It then compares methylation levels using beta-binomial hierarchical modeling and Wald tests. Among its innovative attributes are the analyses of novel methylation sites and methylation linkage, as well as the simultaneous statistical analysis of multiple sample groups. To demonstrate its efficiency, DMRfinder is benchmarked against other computational approaches using a large published dataset. Contrasting two replicates of the same sample yielded minimal genomic regions with DMRfinder, whereas two alternative software packages reported a substantial number of false positives. Further analyses of biological samples revealed fundamental differences between DMRfinder and another software package, despite the fact that they utilize the same underlying statistical basis. For each step, DMRfinder completed the analysis in a fraction of the time required by other software.
CONCLUSIONS: Among the computational approaches for identifying differentially methylated regions from high-throughput bisulfite sequencing datasets, DMRfinder is the first that integrates all the post-alignment steps in a single package. Compared to other software, DMRfinder is extremely efficient and unbiased in this process. DMRfinder is free and open-source software, available on GitHub (github.com/jsh58/DMRfinder); it is written in Python and R, and is supported on Linux.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-017-1909-0) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5817627
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5817627 2018-02-23 DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data Gaspar, John M. Hart, Ronald P. BMC Bioinformatics Software BACKGROUND: DNA methylation is an epigenetic modification that is studied at a single-base resolution with bisulfite treatment followed by high-throughput sequencing. After alignment of the sequence reads to a reference genome, methylation counts are analyzed to determine genomic regions that are differentially methylated between two or more biological conditions. Even though a variety of software packages is available for different aspects of the bioinformatics analysis, they often produce results that are biased or have excessive computational requirements. RESULTS: DMRfinder is a novel computational pipeline that identifies differentially methylated regions efficiently. Following alignment, DMRfinder extracts methylation counts and performs a modified single-linkage clustering of methylation sites into genomic regions. It then compares methylation levels using beta-binomial hierarchical modeling and Wald tests. Among its innovative attributes are the analyses of novel methylation sites and methylation linkage, as well as the simultaneous statistical analysis of multiple sample groups. To demonstrate its efficiency, DMRfinder is benchmarked against other computational approaches using a large published dataset. Contrasting two replicates of the same sample yielded minimal genomic regions with DMRfinder, whereas two alternative software packages reported a substantial number of false positives. Further analyses of biological samples revealed fundamental differences between DMRfinder and another software package, despite the fact that they utilize the same underlying statistical basis. For each step, DMRfinder completed the analysis in a fraction of the time required by other software.
CONCLUSIONS: Among the computational approaches for identifying differentially methylated regions from high-throughput bisulfite sequencing datasets, DMRfinder is the first that integrates all the post-alignment steps in a single package. Compared to other software, DMRfinder is extremely efficient and unbiased in this process. DMRfinder is free and open-source software, available on GitHub (github.com/jsh58/DMRfinder); it is written in Python and R, and is supported on Linux. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-017-1909-0) contains supplementary material, which is available to authorized users. BioMed Central 2017-11-29 /pmc/articles/PMC5817627/ /pubmed/29187143 http://dx.doi.org/10.1186/s12859-017-1909-0 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Gaspar, John M.
Hart, Ronald P.
DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data
title DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data
title_full DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data
title_fullStr DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data
title_full_unstemmed DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data
title_short DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data
title_sort dmrfinder: efficiently identifying differentially methylated regions from methylc-seq data
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5817627/
https://www.ncbi.nlm.nih.gov/pubmed/29187143
http://dx.doi.org/10.1186/s12859-017-1909-0
work_keys_str_mv AT gasparjohnm dmrfinderefficientlyidentifyingdifferentiallymethylatedregionsfrommethylcseqdata
AT hartronaldp dmrfinderefficientlyidentifyingdifferentiallymethylatedregionsfrommethylcseqdata
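
The abstract in this record mentions a "modified single-linkage clustering of methylation sites into genomic regions". As a rough illustration only, the sketch below shows plain (unmodified) single-linkage clustering of site positions on one chromosome: adjacent sites are merged into the same region whenever the gap between them is at most a chosen distance. The function name `cluster_sites`, the parameter `max_gap`, and its value of 100 bp are hypothetical choices for this example and are not taken from DMRfinder; the record does not describe the modifications the authors apply.

```python
from typing import List, Tuple


def cluster_sites(positions: List[int], max_gap: int = 100) -> List[Tuple[int, int, int]]:
    """Group methylation-site positions on one chromosome into regions.

    Plain single-linkage clustering in one dimension: after sorting, a site
    joins the current region if it lies within `max_gap` bases of the
    previous site; otherwise it starts a new region.

    Returns a list of (start, end, number_of_sites) tuples.
    `max_gap` is a hypothetical illustration value, not a DMRfinder default.
    """
    regions: List[Tuple[int, int, int]] = []
    if not positions:
        return regions

    ordered = sorted(positions)
    start = prev = ordered[0]
    count = 1
    for pos in ordered[1:]:
        if pos - prev <= max_gap:
            # Close enough to the previous site: extend the current region.
            prev = pos
            count += 1
        else:
            # Gap too large: close the current region and open a new one.
            regions.append((start, prev, count))
            start = prev = pos
            count = 1
    regions.append((start, prev, count))
    return regions


if __name__ == "__main__":
    # Made-up positions, used only to show the grouping behaviour.
    for region in cluster_sites([10, 50, 120, 400, 450, 1000], max_gap=100):
        print(region)
```

In one dimension, single-linkage clustering reduces to this single pass over sorted positions, which is why the step can be fast. A real pipeline would additionally need per-chromosome handling and whatever region filters the authors use.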