Improved filtering of DNA methylation microarray data by detection p values and its impact on downstream analyses

Bibliographic Details
Main authors: Heiss, Jonathan A., Just, Allan C.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6346546/
https://www.ncbi.nlm.nih.gov/pubmed/30678737
http://dx.doi.org/10.1186/s13148-019-0615-3
author Heiss, Jonathan A.
Just, Allan C.
collection PubMed
description BACKGROUND: DNA methylation microarrays are popular for epigenome-wide association studies (EWAS), but spurious values complicate downstream analysis and threaten replication. A single previous study demonstrated that conventional cut-offs for detection p values are insufficient for filtering out undetected probes, leading to many apparent methylation calls at probes targeting the Y-chromosome in samples from females. We present an alternative approach that calculates more accurate detection p values by utilizing non-specific background fluorescence. We compare our proposed filtering approach with conventional ones by assessing the detection of Y-chromosome probes among males and females in 2755 samples from 17 studies on the 450K microarray, by assessing the masking of large outliers between technical replicates, and by evaluating the downstream impact via an EWAS reanalysis. RESULTS: In contrast to conventional approaches, ours marks most Y-chromosome probes in females as undetected while removing a median of only 0.14% of the data per sample, catches more of the large outliers (30% vs. 6%; large meaning a difference of more than 20 percentage points between technical replicates), and helps to identify strong associations between whole blood DNA methylation and chronological age, previously obfuscated by outliers, in a well-powered EWAS (n = 729). CONCLUSIONS: We provide guidance for filtering both 450K and EPIC microarrays as an essential preprocessing step to reduce spurious values. An implementation (including a function compatible with objects from the popular minfi package) was added to ewastools, an R package for comprehensive quality control of DNA methylation microarrays. Scripts to reproduce all analyses are available at doi.org/10.5281/zenodo.1443561.
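The general idea of a detection p value, as described in the abstract above, can be illustrated with a minimal sketch. This is not the authors' implementation and not the ewastools or minfi API: the function names, the normal model for the background, the example intensities, and the 0.01 threshold are all assumptions chosen for illustration. The sketch fits a normal distribution to a set of background intensities, computes a one-sided p value for each probe's total intensity, and masks methylation values whose signal is not clearly above background.

```python
import math
import statistics


def detection_p_values(total_intensities, background_intensities):
    """One-sided p values for H0: a probe's total signal is background noise.

    Assumption: background fluorescence is modeled as a normal distribution
    whose mean and standard deviation are estimated from the sample itself.
    """
    mu = statistics.fmean(background_intensities)
    sd = statistics.stdev(background_intensities)
    p_values = []
    for x in total_intensities:
        z = (x - mu) / sd
        # Upper-tail probability of a standard normal, P(Z >= z).
        p_values.append(0.5 * math.erfc(z / math.sqrt(2)))
    return p_values


def mask_undetected(betas, p_values, threshold=0.01):
    """Replace beta values with None where the probe was not detected.

    The threshold is a hypothetical illustrative value, not a recommendation.
    """
    return [b if p <= threshold else None for b, p in zip(betas, p_values)]


# Hypothetical example data: five background readings and two probes.
background = [180.0, 220.0, 200.0, 210.0, 190.0]
totals = [205.0, 5000.0]   # first probe is near background, second is bright
betas = [0.48, 0.91]

p = detection_p_values(totals, background)
masked = mask_undetected(betas, p)
```

In this toy example the first probe's total intensity is indistinguishable from background, so its methylation value is masked, while the second probe is kept. How the background intensities are actually obtained is the substance of the article and is not reproduced here.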
format Online
Article
Text
id pubmed-6346546
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6346546 2019-01-29 Improved filtering of DNA methylation microarray data by detection p values and its impact on downstream analyses Heiss, Jonathan A. Just, Allan C. Clin Epigenetics Methodology
BioMed Central 2019-01-24 /pmc/articles/PMC6346546/ /pubmed/30678737 http://dx.doi.org/10.1186/s13148-019-0615-3 Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Improved filtering of DNA methylation microarray data by detection p values and its impact on downstream analyses
topic Methodology