Batch adjustment by reference alignment (BARA): Improved prediction performance in biological test sets with batch effects

Many biological data acquisition platforms suffer from inadvertent inclusion of biologically irrelevant variance in analyzed data, collectively termed batch effects. Batch effects can lead to difficulties in downstream analysis by lowering the power to detect biologically interesting differences and...

Bibliographic Details
Main authors: Gradin, Robin; Lindstedt, Malin; Johansson, Henrik
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6386283/
https://www.ncbi.nlm.nih.gov/pubmed/30794641
http://dx.doi.org/10.1371/journal.pone.0212669
author Gradin, Robin
Lindstedt, Malin
Johansson, Henrik
collection PubMed
description Many biological data acquisition platforms suffer from inadvertent inclusion of biologically irrelevant variance in analyzed data, collectively termed batch effects. Batch effects can lead to difficulties in downstream analysis by lowering the power to detect biologically interesting differences and can in certain instances lead to false discoveries. They are especially troublesome in predictive modelling where samples in training sets and test sets are often completely correlated with batches. In this article, we present BARA, a normalization method for adjusting batch effects in predictive modelling. BARA utilizes a few reference samples to adjust for batch effects in a compressed data space spanned by the training set. We evaluate BARA using a collection of publicly available datasets and three different prediction models, and compare its performance to already existing methods developed for similar purposes. The results show that data normalized with BARA generates high and consistent prediction performances. Further, they suggest that BARA produces reliable performances independent of the examined classifiers. We therefore conclude that BARA has great potential to facilitate the development of predictive assays where test sets and training sets are correlated with batch.
format Online
Article
Text
id pubmed-6386283
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6386283 2019-03-09 Batch adjustment by reference alignment (BARA): Improved prediction performance in biological test sets with batch effects Gradin, Robin Lindstedt, Malin Johansson, Henrik PLoS One Research Article
Public Library of Science 2019-02-22 /pmc/articles/PMC6386283/ /pubmed/30794641 http://dx.doi.org/10.1371/journal.pone.0212669 Text en © 2019 Gradin et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Batch adjustment by reference alignment (BARA): Improved prediction performance in biological test sets with batch effects
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6386283/
https://www.ncbi.nlm.nih.gov/pubmed/30794641
http://dx.doi.org/10.1371/journal.pone.0212669
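
The description in this record states that BARA uses a few reference samples to adjust for batch effects in a compressed data space spanned by the training set. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the compressed space is obtained by a singular value decomposition of the mean-centered training set and that the adjustment is a per-component shift aligning the reference-sample means; the function name `bara_sketch` and all parameter names are hypothetical.

```python
import numpy as np


def bara_sketch(train, test, train_ref_idx, test_ref_idx, n_components=None):
    """Shift a test batch toward the training batch using shared reference samples.

    train, test: arrays of shape (samples, features).
    train_ref_idx, test_ref_idx: row indices of the reference samples in each batch.
    n_components: optional number of leading components to keep (assumption).
    """
    train = np.asarray(train, dtype=float)
    test = np.asarray(test, dtype=float)

    # Center both batches on the training feature means.
    center = train.mean(axis=0)

    # Compressed space (assumed): right singular vectors of the centered training set.
    _, _, vt = np.linalg.svd(train - center, full_matrices=False)
    if n_components is not None:
        vt = vt[:n_components]

    # Project both batches into the compressed space.
    train_scores = (train - center) @ vt.T
    test_scores = (test - center) @ vt.T

    # Estimate the batch effect per component as the difference
    # between the reference-sample means of the two batches.
    shift = (train_scores[train_ref_idx].mean(axis=0)
             - test_scores[test_ref_idx].mean(axis=0))
    adjusted_scores = test_scores + shift

    # Map the adjusted scores back to the original feature space.
    return adjusted_scores @ vt + center
```

When all components are kept and the training set has full column rank, the adjusted reference samples of the test batch have the same feature-wise mean as the training references. Keeping fewer components makes the reconstruction a low-rank approximation of the test data.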