Embracing noise to improve cross-batch prediction accuracy

One important application of microarrays in clinical settings is the construction of diagnostic or prognostic models. Batch effects are a well-known obstacle in this type of application. Recently, a prominent study was published on how batch-effect removal techniques could potentially improve microarra...

Bibliographic Details
Main authors:	Koh, Chuan Hock; Wong, Limsoon
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Proceedings
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3521182/ https://www.ncbi.nlm.nih.gov/pubmed/23282067 http://dx.doi.org/10.1186/1752-0509-6-S2-S3

author	Koh, Chuan Hock; Wong, Limsoon
collection	PubMed
description	One important application of microarrays in clinical settings is the construction of diagnostic or prognostic models. Batch effects are a well-known obstacle in this type of application. Recently, a prominent study was published on how batch-effect removal techniques could potentially improve microarray prediction performance. However, the results were not very encouraging, as prediction performance did not always improve. In fact, in up to 20% of the cases, prediction accuracy was reduced. Furthermore, the paper stated that the techniques studied require sufficiently large sample sizes in both batches (train and test) to be effective, which is not realistic, especially in clinical settings. In this paper, we propose a different approach that overcomes the limitations faced by conventional methods. Our approach uses the rank values of microarray data and a bagging ensemble classifier with sequential hypothesis testing to dynamically determine the number of classifiers required in the ensemble. Using datasets similar to those in the original study, we show that our performance is reduced (by more than 0.05 AUC) in only one case (<2%) and improved (by more than 0.05 AUC) in >60% of cases. In addition, our approach works even on much smaller training data sets and is independent of the sample size of the test data, making it feasible to apply in clinical studies.
format	Online Article Text
id	pubmed-3521182
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3521182 2012-12-14 Embracing noise to improve cross-batch prediction accuracy Koh, Chuan Hock; Wong, Limsoon BMC Syst Biol Proceedings BioMed Central 2012-12-12 /pmc/articles/PMC3521182/ /pubmed/23282067 http://dx.doi.org/10.1186/1752-0509-6-S2-S3 Text en Copyright ©2012 Koh and Wong; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Embracing noise to improve cross-batch prediction accuracy
topic	Proceedings
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3521182/ https://www.ncbi.nlm.nih.gov/pubmed/23282067 http://dx.doi.org/10.1186/1752-0509-6-S2-S3
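
The approach named in the description (rank values, a bagging ensemble, and a sequential test that decides how many classifiers to train) can be illustrated with a minimal sketch. This record does not specify the base classifier or the exact sequential test, so everything concrete here is an assumption made for illustration: the nearest-centroid base learner, the use of Wald's sequential probability ratio test on the ensemble votes, and the parameters delta, alpha and max_models. It is not the authors' implementation.

```python
import math
import random


def rank_transform(sample):
    """Replace each value by its within-sample rank (0 = smallest)."""
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    ranks = [0] * len(sample)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def fit_centroids(X, y):
    """Assumed base learner: one mean rank vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def vote(centroids, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[label]))
    return min(centroids, key=dist)


def sequential_bagging_predict(X, y, x_new, rng,
                               delta=0.2, alpha=0.05, max_models=200):
    """Train bootstrap models one at a time and stop once the votes decide.

    Assumed stopping rule (Wald SPRT on binary votes):
      H1: a base model votes 1 with probability 0.5 + delta
      H0: a base model votes 1 with probability 0.5 - delta
    Returns (predicted label, number of models that voted).
    """
    upper = math.log((1 - alpha) / alpha)  # crossing this favours label 1
    lower = -upper                         # crossing this favours label 0
    step = math.log((0.5 + delta) / (0.5 - delta))
    llr, ones, used = 0.0, 0, 0
    n = len(X)
    for _ in range(max_models):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        labels = [y[i] for i in idx]
        if len(set(labels)) < 2:
            continue  # resample missed a class; skip this attempt
        model = fit_centroids([X[i] for i in idx], labels)
        v = vote(model, x_new)
        used += 1
        ones += v
        llr += step if v == 1 else -step
        if llr >= upper or llr <= lower:
            break
    return (1 if 2 * ones > used else 0), used


# Toy training batch: 4 genes, labels 0/1 (made-up numbers).
train_raw = [[1.0, 2.0, 9.0, 4.0], [1.5, 2.5, 8.0, 3.0], [0.5, 3.0, 7.5, 4.5],
             [9.0, 2.0, 1.0, 4.0], [8.5, 3.0, 1.5, 4.5], [7.0, 2.5, 0.5, 3.5]]
train_y = [0, 0, 0, 1, 1, 1]
train_X = [rank_transform(s) for s in train_raw]

# Test sample from a "different batch": same pattern, shifted and rescaled.
test_raw = [3 * v + 100 for v in [8.0, 2.2, 1.0, 4.0]]
label, n_models = sequential_bagging_predict(
    train_X, train_y, rank_transform(test_raw), random.Random(0))
print(label, n_models)
```

Because within-sample ranks are unchanged by any increasing transformation of the values, the shifted and rescaled test sample maps to the same rank vector as an unshifted one, which is the property that makes ranks attractive for cross-batch prediction. The sequential rule stops early when the votes are unanimous and keeps adding classifiers only when they disagree.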
