
Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity

BACKGROUND: To identify differentially expressed genes (DEGs) from microarray data, users of the Affymetrix GeneChip system need to select both a preprocessing algorithm to obtain expression-level measurements and a way of ranking genes to obtain the most plausible candidates. We recently recommende...


Bibliographic Details
Main Authors:	Kadota, Koji, Nakai, Yuji, Shimizu, Kentaro
Format:	Text
Language:	English
Published:	BioMed Central 2009
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2679019/ https://www.ncbi.nlm.nih.gov/pubmed/19386098 http://dx.doi.org/10.1186/1748-7188-4-7

_version_	1782166869124317184
author	Kadota, Koji Nakai, Yuji Shimizu, Kentaro
author_facet	Kadota, Koji Nakai, Yuji Shimizu, Kentaro
author_sort	Kadota, Koji
collection	PubMed
description	BACKGROUND: To identify differentially expressed genes (DEGs) from microarray data, users of the Affymetrix GeneChip system need to select both a preprocessing algorithm to obtain expression-level measurements and a way of ranking genes to obtain the most plausible candidates. We recently recommended suitable combinations of a preprocessing algorithm and gene ranking method that can be used to identify DEGs with a higher level of sensitivity and specificity. However, in addition to these recommendations, researchers also want to know which combinations enhance reproducibility. RESULTS: We compared eight conventional methods for ranking genes: weighted average difference (WAD), average difference (AD), fold change (FC), rank products (RP), moderated t statistic (modT), significance analysis of microarrays (samT), shrinkage t statistic (shrinkT), and intensity-based moderated t statistic (ibmT) with six preprocessing algorithms (PLIER, VSN, FARMS, multi-mgMOS (mmgMOS), MBEI, and GCRMA). A total of 36 real experimental datasets was evaluated on the basis of the area under the receiver operating characteristic curve (AUC) as a measure for both sensitivity and specificity. We found that the RP method performed well for VSN-, FARMS-, MBEI-, and GCRMA-preprocessed data, and the WAD method performed well for mmgMOS-preprocessed data. Our analysis of the MicroArray Quality Control (MAQC) project's datasets showed that the FC-based gene ranking methods (WAD, AD, FC, and RP) had a higher level of reproducibility: The percentages of overlapping genes (POGs) across different sites for the FC-based methods were higher overall than those for the t-statistic-based methods (modT, samT, shrinkT, and ibmT). In particular, POG values for WAD were the highest overall among the FC-based methods irrespective of the choice of preprocessing algorithm. CONCLUSION: Our results demonstrate that to increase sensitivity, specificity, and reproducibility in microarray analyses, we need to select suitable combinations of preprocessing algorithms and gene ranking methods. We recommend the use of FC-based methods, in particular RP or WAD.
format	Text
id	pubmed-2679019
institution	National Center for Biotechnology Information
language	English
publishDate	2009
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-26790192009-05-08 Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity Kadota, Koji Nakai, Yuji Shimizu, Kentaro Algorithms Mol Biol Research BACKGROUND: To identify differentially expressed genes (DEGs) from microarray data, users of the Affymetrix GeneChip system need to select both a preprocessing algorithm to obtain expression-level measurements and a way of ranking genes to obtain the most plausible candidates. We recently recommended suitable combinations of a preprocessing algorithm and gene ranking method that can be used to identify DEGs with a higher level of sensitivity and specificity. However, in addition to these recommendations, researchers also want to know which combinations enhance reproducibility. RESULTS: We compared eight conventional methods for ranking genes: weighted average difference (WAD), average difference (AD), fold change (FC), rank products (RP), moderated t statistic (modT), significance analysis of microarrays (samT), shrinkage t statistic (shrinkT), and intensity-based moderated t statistic (ibmT) with six preprocessing algorithms (PLIER, VSN, FARMS, multi-mgMOS (mmgMOS), MBEI, and GCRMA). A total of 36 real experimental datasets was evaluated on the basis of the area under the receiver operating characteristic curve (AUC) as a measure for both sensitivity and specificity. We found that the RP method performed well for VSN-, FARMS-, MBEI-, and GCRMA-preprocessed data, and the WAD method performed well for mmgMOS-preprocessed data. Our analysis of the MicroArray Quality Control (MAQC) project's datasets showed that the FC-based gene ranking methods (WAD, AD, FC, and RP) had a higher level of reproducibility: The percentages of overlapping genes (POGs) across different sites for the FC-based methods were higher overall than those for the t-statistic-based methods (modT, samT, shrinkT, and ibmT). In particular, POG values for WAD were the highest overall among the FC-based methods irrespective of the choice of preprocessing algorithm. CONCLUSION: Our results demonstrate that to increase sensitivity, specificity, and reproducibility in microarray analyses, we need to select suitable combinations of preprocessing algorithms and gene ranking methods. We recommend the use of FC-based methods, in particular RP or WAD. BioMed Central 2009-04-22 /pmc/articles/PMC2679019/ /pubmed/19386098 http://dx.doi.org/10.1186/1748-7188-4-7 Text en Copyright © 2009 Kadota et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Kadota, Koji Nakai, Yuji Shimizu, Kentaro Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity
title	Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity
title_full	Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity
title_fullStr	Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity
title_full_unstemmed	Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity
title_short	Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity
title_sort	ranking differentially expressed genes from affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2679019/ https://www.ncbi.nlm.nih.gov/pubmed/19386098 http://dx.doi.org/10.1186/1748-7188-4-7
work_keys_str_mv	AT kadotakoji rankingdifferentiallyexpressedgenesfromaffymetrixgeneexpressiondatamethodswithreproducibilitysensitivityandspecificity AT nakaiyuji rankingdifferentiallyexpressedgenesfromaffymetrixgeneexpressiondatamethodswithreproducibilitysensitivityandspecificity AT shimizukentaro rankingdifferentiallyexpressedgenesfromaffymetrixgeneexpressiondatamethodswithreproducibilitysensitivityandspecificity
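
Illustrative note (not part of the catalogued record): the description field evaluates gene rankings by AUC (sensitivity and specificity) and by the percentage of overlapping genes (POG) across sites (reproducibility). The sketch below shows one plausible way to compute both on toy data. It is not the authors' code. The function names, gene IDs, and expression values are hypothetical, POG is assumed here to mean the share of genes common to two top-N lists, and the simple average difference (AD) of log-scale expression stands in for the FC-based methods, since the record does not define the WAD weighting.

```python
from statistics import mean

def rank_genes_by_ad(expr_a, expr_b):
    """Rank gene IDs by |average difference| between two groups, largest first.

    expr_a / expr_b map gene ID -> list of log-scale expression values.
    Ties are broken by gene ID so the ordering is deterministic.
    """
    scores = {g: abs(mean(expr_b[g]) - mean(expr_a[g])) for g in expr_a}
    return sorted(scores, key=lambda g: (-scores[g], g))

def pog(ranking_1, ranking_2, top_n):
    """Percentage of overlapping genes between the top-N of two rankings (assumed definition)."""
    shared = set(ranking_1[:top_n]) & set(ranking_2[:top_n])
    return 100.0 * len(shared) / top_n

def auc(ranking, true_degs):
    """AUC of a ranking: fraction of (DEG, non-DEG) pairs where the DEG is ranked higher."""
    seen_pos = 0
    wins = 0
    for gene in ranking:
        if gene in true_degs:
            seen_pos += 1
        else:
            wins += seen_pos  # every DEG already seen outranks this non-DEG
    n_neg = len(ranking) - seen_pos
    return wins / (seen_pos * n_neg)

# Hypothetical toy data: five genes, two replicates per group, two test sites.
group_a = {"g1": [5.0, 5.0], "g2": [7.0, 7.0], "g3": [3.0, 3.0],
           "g4": [6.0, 6.0], "g5": [4.0, 4.0]}
site1_b = {"g1": [8.0, 8.0], "g2": [7.5, 7.5], "g3": [5.0, 5.0],
           "g4": [6.25, 6.25], "g5": [4.0, 4.0]}
site2_b = {"g1": [7.0, 7.0], "g2": [7.0, 7.0], "g3": [4.0, 4.0],
           "g4": [7.5, 7.5], "g5": [4.5, 4.5]}

ranking_site1 = rank_genes_by_ad(group_a, site1_b)
ranking_site2 = rank_genes_by_ad(group_a, site2_b)
true_degs = {"g1", "g3"}  # assumed ground truth for the toy example

pog_top2 = pog(ranking_site1, ranking_site2, top_n=2)
auc_site1 = auc(ranking_site1, true_degs)
auc_site2 = auc(ranking_site2, true_degs)
print(ranking_site1, ranking_site2, pog_top2, auc_site1, auc_site2)
```

A higher POG means the two sites agree on more of their top candidates, and an AUC closer to 1 means known DEGs sit nearer the top of the list. Swapping in another statistic only requires replacing the scoring line in `rank_genes_by_ad`.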
