Alternative empirical Bayes models for adjusting for batch effects in genomic studies

BACKGROUND: Combining genomic data sets from multiple studies is advantageous to increase statistical power in studies where logistical considerations restrict sample size or require the sequential generation of data. However, significant technical heterogeneity is commonly observed across multiple...

Bibliographic Details
Main Authors: Zhang, Yuqing; Jenkins, David F.; Manimaran, Solaiappan; Johnson, W. Evan
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6044013/
https://www.ncbi.nlm.nih.gov/pubmed/30001694
http://dx.doi.org/10.1186/s12859-018-2263-6
author Zhang, Yuqing
Jenkins, David F.
Manimaran, Solaiappan
Johnson, W. Evan
collection PubMed
description BACKGROUND: Combining genomic data sets from multiple studies is advantageous to increase statistical power in studies where logistical considerations restrict sample size or require the sequential generation of data. However, significant technical heterogeneity is commonly observed across multiple batches of data that are generated from different processing or reagent batches, experimenters, protocols, or profiling platforms. These so-called batch effects often confound true biological relationships in the data, reducing the power benefits of combining multiple batches, and may even lead to spurious results in some combined studies. Therefore there is significant need for effective methods and software tools that account for batch effects in high-throughput genomic studies. RESULTS: Here we contribute multiple methods and software tools for improved combination and analysis of data from multiple batches. In particular, we provide batch effect solutions for cases where the severity of the batch effects is not extreme, and for cases where one high-quality batch can serve as a reference, such as the training set in a biomarker study. We illustrate our approaches and software in both simulated and real data scenarios. CONCLUSIONS: We demonstrate the value of these new contributions compared to currently established approaches in the specified batch correction situations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2263-6) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6044013
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6044013 2018-07-13 Alternative empirical Bayes models for adjusting for batch effects in genomic studies Zhang, Yuqing; Jenkins, David F.; Manimaran, Solaiappan; Johnson, W. Evan BMC Bioinformatics Methodology Article
BioMed Central 2018-07-13 /pmc/articles/PMC6044013/ /pubmed/30001694 http://dx.doi.org/10.1186/s12859-018-2263-6 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Alternative empirical Bayes models for adjusting for batch effects in genomic studies
topic Methodology Article