Stratified randomization controls better for batch effects in 450K methylation analysis: a cautionary tale

Background: Batch effects in DNA methylation microarray experiments can lead to spurious results if not properly handled during the plating of samples. Methods: Two pilot studies examining the association of DNA methylation patterns across the genome with obesity in Samoan men were investigated for...

Bibliographic Details
Main authors: Buhule, Olive D., Minster, Ryan L., Hawley, Nicola L., Medvedovic, Mario, Sun, Guangyun, Viali, Satupaitea, Deka, Ranjan, McGarvey, Stephen T., Weeks, Daniel E.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2014
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4195366/
https://www.ncbi.nlm.nih.gov/pubmed/25352862
http://dx.doi.org/10.3389/fgene.2014.00354
author Buhule, Olive D.
Minster, Ryan L.
Hawley, Nicola L.
Medvedovic, Mario
Sun, Guangyun
Viali, Satupaitea
Deka, Ranjan
McGarvey, Stephen T.
Weeks, Daniel E.
author_sort Buhule, Olive D.
collection PubMed
description Background: Batch effects in DNA methylation microarray experiments can lead to spurious results if not properly handled during the plating of samples. Methods: Two pilot studies examining the association of DNA methylation patterns across the genome with obesity in Samoan men were investigated for chip- and row-specific batch effects. For each study, DNA from 46 obese men and 46 lean men was assayed using Illumina's Infinium HumanMethylation450 BeadChip. In the first study (Sample One), samples from obese and lean subjects were examined on separate chips. In the second study (Sample Two), the samples were balanced on the chips by lean/obese status, age group, and census region. We used the methylumi, wateRmelon, and limma R packages, as well as ComBat, to analyze the data. Principal component analysis was used to identify the top principal components, and linear regression was used to test their association with the batches and with lean/obese status. To identify differentially methylated positions (DMPs) between obese and lean males at each locus, we used a moderated t-test. Results: Chip effects were effectively removed from Sample Two but not from Sample One. In addition, dramatic differences were observed between the two sets of DMP results. After “removing” batch effects with ComBat, Sample One had 94,191 probes differentially methylated at a q-value threshold of 0.05, while Sample Two had zero differentially methylated probes. The disparate results from Sample One and Sample Two likely arise from the confounding of lean/obese status with chip and row batch effects. Conclusion: Even the best possible statistical adjustments for batch effects may not completely remove them. Proper study design is vital for guarding against spurious findings due to such effects.
format Online
Article
Text
id pubmed-4195366
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4195366 2014-10-28 Front Genet Genetics Frontiers Media S.A. 2014-10-13 /pmc/articles/PMC4195366/ /pubmed/25352862 http://dx.doi.org/10.3389/fgene.2014.00354 Text en Copyright © 2014 Buhule, Minster, Hawley, Medvedovic, Sun, Viali, Deka, McGarvey and Weeks. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Stratified randomization controls better for batch effects in 450K methylation analysis: a cautionary tale
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4195366/
https://www.ncbi.nlm.nih.gov/pubmed/25352862
http://dx.doi.org/10.3389/fgene.2014.00354