seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data
Motivation: One of the main goals of large scale methylation studies is to detect differentially methylated loci. One way is to approach this problem sitewise, i.e. to find differentially methylated positions (DMPs). However, it has been shown that methylation is regulated in longer genomic regions....
Main authors: | Kolde, Raivo; Märtens, Kaspar; Lokk, Kaie; Laur, Sven; Vilo, Jaak |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5013909/ https://www.ncbi.nlm.nih.gov/pubmed/27187204 http://dx.doi.org/10.1093/bioinformatics/btw304 |
_version_ | 1782452237077839872 |
---|---|
author | Kolde, Raivo Märtens, Kaspar Lokk, Kaie Laur, Sven Vilo, Jaak |
author_sort | Kolde, Raivo |
collection | PubMed |
description | Motivation: One of the main goals of large scale methylation studies is to detect differentially methylated loci. One way is to approach this problem sitewise, i.e. to find differentially methylated positions (DMPs). However, it has been shown that methylation is regulated in longer genomic regions, so it is more desirable to identify differentially methylated regions (DMRs) instead of DMPs. The new high coverage arrays, like Illumina's 450k platform, make this possible at a reasonable cost. A few tools exist for DMR identification from this type of data, but there is no standard approach. Results: We propose a novel method for DMR identification that detects the region boundaries according to the minimum description length (MDL) principle, essentially solving the problem of model selection. The significance of the regions is established using linear mixed models. Using both simulated and large publicly available methylation datasets, we compare seqlm performance to alternative approaches. We demonstrate that it is both more sensitive and more specific than competing methods. This is achieved with minimal parameter tuning and, surprisingly, the quickest running time of all the methods tried. Finally, we show that the regional differential methylation patterns identified on sparse array data are confirmed by higher resolution sequencing approaches. Availability and Implementation: The methods have been implemented in the R package seqlm, which is available through GitHub: https://github.com/raivokolde/seqlm Contact: rkolde@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-5013909 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-5013909 2016-09-12 Bioinformatics Original Papers Oxford University Press 2016-09-01 2016-05-13 /pmc/articles/PMC5013909/ /pubmed/27187204 http://dx.doi.org/10.1093/bioinformatics/btw304 Text en © The Author 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | seqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5013909/ https://www.ncbi.nlm.nih.gov/pubmed/27187204 http://dx.doi.org/10.1093/bioinformatics/btw304 |