
Principal component analysis-based filtering improves detection for Affymetrix gene expression arrays

Gene expression array technology has reached the stage of being routinely used to study clinical samples in search of diagnostic and prognostic biomarkers. Due to the nature of array experiments, which examine the expression of tens of thousands of genes simultaneously, the number of null hypotheses...


Bibliographic Details
Main authors: Lu, Jun; Kerns, Robnet T.; Peddada, Shyamal D.; Bushel, Pierre R.
Format: Online Article Text
Language: English
Published: Oxford University Press 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3141272/
https://www.ncbi.nlm.nih.gov/pubmed/21525126
http://dx.doi.org/10.1093/nar/gkr241
author Lu, Jun
Kerns, Robnet T.
Peddada, Shyamal D.
Bushel, Pierre R.
collection PubMed
description Gene expression array technology has reached the stage of being routinely used to study clinical samples in search of diagnostic and prognostic biomarkers. Due to the nature of array experiments, which examine the expression of tens of thousands of genes simultaneously, the number of null hypotheses is large. Hence, multiple testing correction is often necessary to control the number of false positives. However, multiple testing correction can lead to low statistical power in detecting genes that are truly differentially expressed. Filtering out non-informative genes allows for reduction in the number of null hypotheses. While several filtering methods have been suggested, the appropriate way to perform filtering is still debatable. We propose a new filtering strategy for Affymetrix GeneChips®, based on principal component analysis of probe-level gene expression data. Using a wholly defined spike-in data set and one from a diabetes study, we show that filtering by the proportion of variation accounted for by the first principal component (PVAC) provides increased sensitivity in detecting truly differentially expressed genes while controlling false discoveries. We demonstrate that PVAC exhibits equal or better performance than several widely used filtering methods. Furthermore, a data-driven approach that guides the selection of the filtering threshold value is also proposed.
format Online
Article
Text
id pubmed-3141272
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3141272 2011-07-22 Principal component analysis-based filtering improves detection for Affymetrix gene expression arrays Lu, Jun Kerns, Robnet T. Peddada, Shyamal D. Bushel, Pierre R. Nucleic Acids Res Methods Online Oxford University Press 2011-07 2011-04-27 /pmc/articles/PMC3141272/ /pubmed/21525126 http://dx.doi.org/10.1093/nar/gkr241 Text en Published by Oxford University Press 2011.
http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Principal component analysis-based filtering improves detection for Affymetrix gene expression arrays
topic Methods Online
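The abstract describes filtering by the proportion of variation accounted for by the first principal component (PVAC) of probe-level data. The record gives no implementation details, so the following is only a minimal sketch under stated assumptions: each probe set is a small matrix with one row per probe and one column per sample, each probe is centered across samples, and the first eigenvalue is estimated by power iteration. The names `pvac` and `filter_probe_sets`, the matrix layout, and the threshold are hypothetical, the data are simulated, and this is not the authors' code or their data-driven threshold procedure.

```python
import random


def pvac(probe_matrix, iterations=200):
    """Share of total variation captured by the first principal component.

    probe_matrix: list of rows, one row per probe, one value per sample
    (an assumed layout, not taken from the article).
    """
    n_samples = len(probe_matrix[0])
    # Center each probe across samples.
    centered = []
    for row in probe_matrix:
        mean = sum(row) / n_samples
        centered.append([value - mean for value in row])
    p = len(centered)
    # Probe-by-probe cross-product matrix, proportional to the covariance matrix.
    gram = [[sum(a * b for a, b in zip(centered[i], centered[j]))
             for j in range(p)] for i in range(p)]
    total = sum(gram[i][i] for i in range(p))  # trace = sum of all eigenvalues
    if total <= 0:
        return 0.0  # no variation at all
    # Power iteration from a fixed positive start vector.
    rng = random.Random(0)
    vec = [rng.random() + 0.5 for _ in range(p)]
    for _ in range(iterations):
        nxt = [sum(gram[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = sum(v * v for v in nxt) ** 0.5
        if norm == 0:
            break
        vec = [v / norm for v in nxt]
    # Rayleigh quotient estimates the largest eigenvalue.
    gv = [sum(gram[i][j] * vec[j] for j in range(p)) for i in range(p)]
    top = sum(a * b for a, b in zip(vec, gv)) / sum(v * v for v in vec)
    return top / total


def filter_probe_sets(probe_sets, threshold):
    """Keep the names of probe sets whose PVAC reaches the threshold."""
    return [name for name, matrix in probe_sets.items()
            if pvac(matrix) >= threshold]


# Simulated example: 11 probes, 8 samples per probe set.
rng = random.Random(0)
pattern = [rng.gauss(0, 1) for _ in range(8)]
probe_sets = {
    # All probes follow one shared sample pattern plus small noise.
    "concordant": [[s + rng.gauss(0, 0.1) for s in pattern] for _ in range(11)],
    # Probes vary independently of each other.
    "noise": [[rng.gauss(0, 1) for _ in range(8)] for _ in range(11)],
}
for name, matrix in probe_sets.items():
    print(name, round(pvac(matrix), 3))
```

In this sketch, probes that agree on one pattern across samples score close to 1, while probes that vary independently score lower, which is the intuition for treating a low PVAC as a sign of a non-informative gene. Dropping such genes before testing reduces the number of null hypotheses subject to multiple testing correction.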