DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data
BACKGROUND: DNA methylation is an epigenetic modification that is studied at a single-base resolution with bisulfite treatment followed by high-throughput sequencing. After alignment of the sequence reads to a reference genome, methylation counts are analyzed to determine genomic regions that are di...
Main Authors: | Gaspar, John M., Hart, Ronald P. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Software |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5817627/ https://www.ncbi.nlm.nih.gov/pubmed/29187143 http://dx.doi.org/10.1186/s12859-017-1909-0 |
_version_ | 1783300898081996800 |
---|---|
author | Gaspar, John M. Hart, Ronald P. |
author_facet | Gaspar, John M. Hart, Ronald P. |
author_sort | Gaspar, John M. |
collection | PubMed |
description | BACKGROUND: DNA methylation is an epigenetic modification that is studied at single-base resolution with bisulfite treatment followed by high-throughput sequencing. After alignment of the sequence reads to a reference genome, methylation counts are analyzed to determine genomic regions that are differentially methylated between two or more biological conditions. Even though a variety of software packages is available for different aspects of the bioinformatics analysis, they often produce biased results or require excessive computational resources. RESULTS: DMRfinder is a novel computational pipeline that identifies differentially methylated regions efficiently. Following alignment, DMRfinder extracts methylation counts and performs a modified single-linkage clustering of methylation sites into genomic regions. It then compares methylation levels using beta-binomial hierarchical modeling and Wald tests. Among its innovative attributes are the analyses of novel methylation sites and methylation linkage, as well as the simultaneous statistical analysis of multiple sample groups. To demonstrate its efficiency, DMRfinder is benchmarked against other computational approaches using a large published dataset. Contrasting two replicates of the same sample yielded minimal genomic regions with DMRfinder, whereas two alternative software packages reported a substantial number of false positives. Further analyses of biological samples revealed fundamental differences between DMRfinder and another software package, even though both use the same underlying statistical basis. For each step, DMRfinder completed the analysis in a fraction of the time required by other software. CONCLUSIONS: Among the computational approaches for identifying differentially methylated regions from high-throughput bisulfite sequencing datasets, DMRfinder is the first that integrates all the post-alignment steps in a single package. Compared to other software, DMRfinder is extremely efficient and unbiased in this process. DMRfinder is free and open-source software, available on GitHub (github.com/jsh58/DMRfinder); it is written in Python and R, and is supported on Linux. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-017-1909-0) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5817627 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5817627 2018-02-23 DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data Gaspar, John M. Hart, Ronald P. BMC Bioinformatics Software BioMed Central 2017-11-29 /pmc/articles/PMC5817627/ /pubmed/29187143 http://dx.doi.org/10.1186/s12859-017-1909-0 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Gaspar, John M. Hart, Ronald P. DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data |
title | DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5817627/ https://www.ncbi.nlm.nih.gov/pubmed/29187143 http://dx.doi.org/10.1186/s12859-017-1909-0 |
work_keys_str_mv | AT gasparjohnm dmrfinderefficientlyidentifyingdifferentiallymethylatedregionsfrommethylcseqdata AT hartronaldp dmrfinderefficientlyidentifyingdifferentiallymethylatedregionsfrommethylcseqdata |
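
The abstract in this record says that DMRfinder groups methylation sites into genomic regions with a modified single-linkage clustering. The record gives no further detail, so the following is only a minimal sketch of plain single-linkage clustering on one chromosome, not the authors' implementation. The function name `cluster_sites` and the parameters `max_gap` and `min_sites`, including their default values, are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def cluster_sites(
    positions: List[int],
    max_gap: int = 100,   # assumed threshold, not taken from the record
    min_sites: int = 3,   # assumed minimum region size, not taken from the record
) -> List[Tuple[int, int, int]]:
    """Group 1-D site positions into regions by single-linkage clustering.

    On a line, single linkage with a distance threshold reduces to walking
    the sorted positions and starting a new cluster whenever the gap to the
    previous site exceeds max_gap.
    Returns (start, end, site_count) for each retained region.
    """
    regions: List[Tuple[int, int, int]] = []
    cluster: List[int] = []
    for pos in sorted(positions):
        # A gap larger than max_gap closes the current cluster.
        if cluster and pos - cluster[-1] > max_gap:
            if len(cluster) >= min_sites:
                regions.append((cluster[0], cluster[-1], len(cluster)))
            cluster = []
        cluster.append(pos)
    # Flush the final cluster.
    if len(cluster) >= min_sites:
        regions.append((cluster[0], cluster[-1], len(cluster)))
    return regions


if __name__ == "__main__":
    sites = [10, 50, 120, 500, 520, 900]
    print(cluster_sites(sites))
```

In this sketch, sites chained by gaps of at most `max_gap` end up in one region, and regions with fewer than `min_sites` sites are dropped. The downstream beta-binomial modeling and Wald tests named in the abstract are not covered here.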