Comparison of High-Level Microarray Analysis Methods in the Context of Result Consistency

MOTIVATION: When we were asked for help with high-level microarray data analysis (on Affymetrix HGU-133A microarray), we faced the problem of selecting an appropriate method. We wanted to select a method that would yield "the best result" (detected as many "really" differentially...

Bibliographic Details
Main authors:	Chrominski, Kornel; Tkacz, Magdalena
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2015
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4461299/ https://www.ncbi.nlm.nih.gov/pubmed/26057385 http://dx.doi.org/10.1371/journal.pone.0128845

author	Chrominski, Kornel Tkacz, Magdalena
collection	PubMed
description	MOTIVATION: When we were asked for help with high-level microarray data analysis (on the Affymetrix HGU-133A microarray), we faced the problem of selecting an appropriate method. We wanted to select a method that would yield "the best result" (detecting as many "really" differentially expressed genes (DEGs) as possible, without false positives and false negatives). However, life scientists could not help us: they use their "favorite" method without particular justification. We also did not find any norm or recommendation. Therefore, we decided to examine the question for our own purposes. We considered whether the results obtained using different methods of high-level microarray data analysis (Significance Analysis of Microarrays (SAM), Rank Products, Bland-Altman, the Mann-Whitney test (MW), the T test (TT) and Linear Models for Microarray Data (LIMMA)) would be in agreement. Initially, we conducted a comparative analysis of the results on eight real data sets from microarray experiments (from the ArrayExpress database). The results were surprising. On the same array set, the sets of DEGs detected by different methods differed significantly. We also applied the methods to artificial data sets and determined some measures that allow the preparation of an overall scoring of the tested methods for future recommendation. RESULTS: We found a very low level of concordance between the results of the tested methods on real array sets. The number of common DEGs (detected by all six methods on a fixed array set, checked on eight array sets) ranged from 6 to 433 (out of 22,283 total array readings). Results on artificial data sets were better than those on the real data. However, they were not fully satisfactory. We scored the tested methods on accuracy, recall, precision, F-measure and Matthews correlation coefficient. Based on the overall scoring, the best methods were SAM and LIMMA. We also found TT to be acceptable. The worst-scoring method was MW. Based on our study, we recommend: 1. Carefully taking into account the needs of the study when choosing a method, 2. Performing high-level analysis with more than one method and then taking only the genes that are common to all methods (which seems to be reasonable) and 3. Being very careful (while summarizing facts) about sets of differentially expressed genes: different methods discover different sets of DEGs.
format	Online Article Text
id	pubmed-4461299
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-4461299 2015-06-16 Comparison of High-Level Microarray Analysis Methods in the Context of Result Consistency Chrominski, Kornel Tkacz, Magdalena PLoS One Research Article Public Library of Science 2015-06-09 /pmc/articles/PMC4461299/ /pubmed/26057385 http://dx.doi.org/10.1371/journal.pone.0128845 Text en © 2015 Chrominski, Tkacz http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title	Comparison of High-Level Microarray Analysis Methods in the Context of Result Consistency
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4461299/ https://www.ncbi.nlm.nih.gov/pubmed/26057385 http://dx.doi.org/10.1371/journal.pone.0128845
