An empirically driven data reduction method on the human 450K methylation array to remove tissue specific non-variable CpGs
BACKGROUND: Population based epigenetic association studies of disease and exposures are becoming more common with the availability of economical genome-wide technologies for interrogation of the methylome, such as the Illumina 450K Human Methylation Array (450K). Often, the expected small number of...
Main authors: | Edgar, Rachel D., Jones, Meaghan J., Robinson, Wendy P., Kobor, Michael S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Methodology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5290610/ https://www.ncbi.nlm.nih.gov/pubmed/28184257 http://dx.doi.org/10.1186/s13148-017-0320-z |
_version_ | 1782504666483916800 |
---|---|
author | Edgar, Rachel D. Jones, Meaghan J. Robinson, Wendy P. Kobor, Michael S. |
author_sort | Edgar, Rachel D. |
collection | PubMed |
description | BACKGROUND: Population based epigenetic association studies of disease and exposures are becoming more common with the availability of economical genome-wide technologies for interrogation of the methylome, such as the Illumina 450K Human Methylation Array (450K). Often, the expected small number of differentially methylated cytosine-guanine pairs (CpGs) in studies of the human methylome presents a statistical challenge, as the large number of CpGs measured on the 450K necessitates careful multiple test correction. While the 450K is a highly useful tool for population epigenetic studies, many of the CpGs tested are not variable and thus of limited information content in the context of the study and tissue. CpGs with observed lack of variability in the tissue under study could be removed to reduce the data dimensionality, limit the severity of multiple test correction and allow for improved detection of differential DNA methylation. METHODS: Here, we performed a meta-analysis of 450K data from three commonly studied human tissues, namely blood (605 samples), buccal epithelial cells (121 samples) and placenta (157 samples). We developed lists of CpGs that are non-variable in each tissue. RESULTS: These lists are surprisingly large (blood 114,204 CpGs, buccal epithelial cells 120,009 CpGs and placenta 101,367 CpGs) and thus will be valuable filters for epigenetic association studies, considerably reducing the dimensionality of the 450K and subsequently the multiple testing correction severity. CONCLUSIONS: We propose this empirically derived method for data reduction to allow for more power in detecting differential DNA methylation associated with exposures in studies on the human methylome. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13148-017-0320-z) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5290610 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5290610 2017-02-09 An empirically driven data reduction method on the human 450K methylation array to remove tissue specific non-variable CpGs Edgar, Rachel D. Jones, Meaghan J. Robinson, Wendy P. Kobor, Michael S. Clin Epigenetics Methodology BioMed Central 2017-02-02 /pmc/articles/PMC5290610/ /pubmed/28184257 http://dx.doi.org/10.1186/s13148-017-0320-z Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | An empirically driven data reduction method on the human 450K methylation array to remove tissue specific non-variable CpGs |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5290610/ https://www.ncbi.nlm.nih.gov/pubmed/28184257 http://dx.doi.org/10.1186/s13148-017-0320-z |
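
The data reduction idea summarized in the description field can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation: the variability criterion (10th to 90th percentile range below 0.05 in beta value), the function names, and the toy probe IDs are all assumptions made for the example, and the total probe count of 485,577 for the 450K array is not stated in this record. Only the blood list size (114,204 CpGs) comes from the abstract.

```python
from statistics import quantiles


def percentile_range(values, low=10, high=90):
    """Spread between the low-th and high-th percentiles of a list of beta values."""
    cuts = quantiles(values, n=100, method="inclusive")  # 99 percentile cut points
    return cuts[high - 1] - cuts[low - 1]


def non_variable_cpgs(betas, threshold=0.05):
    """Return CpG IDs whose percentile range is below an ASSUMED threshold."""
    return {cpg for cpg, vals in betas.items() if percentile_range(vals) < threshold}


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


# Toy reference data: hypothetical probe IDs mapped to beta values across samples.
reference = {
    "cg_a": [0.02, 0.03, 0.02, 0.03, 0.02, 0.03],  # consistently unmethylated
    "cg_b": [0.10, 0.45, 0.80, 0.30, 0.65, 0.20],  # variable between samples
    "cg_c": [0.97, 0.98, 0.97, 0.98, 0.97, 0.98],  # consistently methylated
}

flat = non_variable_cpgs(reference)  # probes to filter out
kept = sorted(set(reference) - flat)  # probes carried into association testing

# Effect on the multiple-testing burden, using the blood list size from the abstract.
total_probes = 485_577  # assumption: full 450K probe count, not given in this record
blood_non_variable = 114_204  # from the abstract
alpha_before = bonferroni_alpha(total_probes)
alpha_after = bonferroni_alpha(total_probes - blood_non_variable)

print(kept)
print(f"{alpha_before:.3g} -> {alpha_after:.3g}")
```

In practice the filter list would be derived from independent reference data for the same tissue, as the abstract describes, and then applied to a new study's probes before testing, so that the choice of probes does not depend on the study's own outcome variable.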