Adjusting for Batch Effects in DNA Methylation Microarray Data, a Lesson Learned
It is well-known, but frequently overlooked, that low- and high-throughput molecular data may contain batch effects, i.e., systematic technical variation. Confounding of experimental batches with the variable(s) of interest is especially concerning, as a batch effect may then be interpreted as a bio...
Main Authors: | Price, E. M., Robinson, Wendy P. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2018 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5864890/ https://www.ncbi.nlm.nih.gov/pubmed/29616078 http://dx.doi.org/10.3389/fgene.2018.00083 |
_version_ | 1783308579118252032 |
---|---|
author | Price, E. M. Robinson, Wendy P. |
author_facet | Price, E. M. Robinson, Wendy P. |
author_sort | Price, E. M. |
collection | PubMed |
description | It is well-known, but frequently overlooked, that low- and high-throughput molecular data may contain batch effects, i.e., systematic technical variation. Confounding of experimental batches with the variable(s) of interest is especially concerning, as a batch effect may then be interpreted as a biologically significant finding. An integral step toward reducing false discovery in molecular data analysis includes inspection for batch effects and accounting for this signal if present. In a 30-sample pilot Illumina Infinium HumanMethylation450 (450k array) experiment, we identified two sources of batch effects: row and chip. Here, we demonstrate two approaches taken to process the 450k data in which an R function, ComBat, was applied to adjust for the non-biological signal. In the “initial analysis,” the application of ComBat to an unbalanced study design resulted in 9,612 and 19,214 significant (FDR < 0.05) DNA methylation differences, despite none present prior to correction. Suspicious of this dramatic change, a “revised processing” included changes to our analysis as well as a greater number of samples, and successfully reduced batch effects without introducing false signal. Our work supports conclusions made by an article previously published in this journal: though the ultimate antidote to batch effects is thoughtful study design, every DNA methylation microarray analysis should inspect, assess and, if necessary, account for batch effects. The analysis experience presented here can serve as a reminder to the broader community to establish research questions a priori, ensure that they match with study design and encourage communication between technicians and analysts. |
format | Online Article Text |
id | pubmed-5864890 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-58648902018-04-03 Adjusting for Batch Effects in DNA Methylation Microarray Data, a Lesson Learned Price, E. M. Robinson, Wendy P. Front Genet Genetics It is well-known, but frequently overlooked, that low- and high-throughput molecular data may contain batch effects, i.e., systematic technical variation. Confounding of experimental batches with the variable(s) of interest is especially concerning, as a batch effect may then be interpreted as a biologically significant finding. An integral step toward reducing false discovery in molecular data analysis includes inspection for batch effects and accounting for this signal if present. In a 30-sample pilot Illumina Infinium HumanMethylation450 (450k array) experiment, we identified two sources of batch effects: row and chip. Here, we demonstrate two approaches taken to process the 450k data in which an R function, ComBat, was applied to adjust for the non-biological signal. In the “initial analysis,” the application of ComBat to an unbalanced study design resulted in 9,612 and 19,214 significant (FDR < 0.05) DNA methylation differences, despite none present prior to correction. Suspicious of this dramatic change, a “revised processing” included changes to our analysis as well as a greater number of samples, and successfully reduced batch effects without introducing false signal. Our work supports conclusions made by an article previously published in this journal: though the ultimate antidote to batch effects is thoughtful study design, every DNA methylation microarray analysis should inspect, assess and, if necessary, account for batch effects. The analysis experience presented here can serve as a reminder to the broader community to establish research questions a priori, ensure that they match with study design and encourage communication between technicians and analysts. Frontiers Media S.A. 2018-03-16 /pmc/articles/PMC5864890/ /pubmed/29616078 http://dx.doi.org/10.3389/fgene.2018.00083 Text en Copyright © 2018 Price and Robinson. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Price, E. M. Robinson, Wendy P. Adjusting for Batch Effects in DNA Methylation Microarray Data, a Lesson Learned |
title | Adjusting for Batch Effects in DNA Methylation Microarray Data, a Lesson Learned |
title_full | Adjusting for Batch Effects in DNA Methylation Microarray Data, a Lesson Learned |
title_fullStr | Adjusting for Batch Effects in DNA Methylation Microarray Data, a Lesson Learned |
title_full_unstemmed | Adjusting for Batch Effects in DNA Methylation Microarray Data, a Lesson Learned |
title_short | Adjusting for Batch Effects in DNA Methylation Microarray Data, a Lesson Learned |
title_sort | adjusting for batch effects in dna methylation microarray data, a lesson learned |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5864890/ https://www.ncbi.nlm.nih.gov/pubmed/29616078 http://dx.doi.org/10.3389/fgene.2018.00083 |
work_keys_str_mv | AT priceem adjustingforbatcheffectsindnamethylationmicroarraydataalessonlearned AT robinsonwendyp adjustingforbatcheffectsindnamethylationmicroarraydataalessonlearned |