
Principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data

Principal components analysis (PCA) is a common unsupervised method for the analysis of gene expression microarray data, providing information on the overall structure of the analyzed dataset. In recent years, it has been applied to very large datasets involving many different tissues and cell t...


Bibliographic Details
Main authors: Lenz, Michael, Müller, Franz-Josef, Zenke, Martin, Schuppert, Andreas
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4890592/
https://www.ncbi.nlm.nih.gov/pubmed/27254731
http://dx.doi.org/10.1038/srep25696
_version_ 1782435129879166976
author Lenz, Michael
Müller, Franz-Josef
Zenke, Martin
Schuppert, Andreas
author_facet Lenz, Michael
Müller, Franz-Josef
Zenke, Martin
Schuppert, Andreas
author_sort Lenz, Michael
collection PubMed
description Principal components analysis (PCA) is a common unsupervised method for the analysis of gene expression microarray data, providing information on the overall structure of the analyzed dataset. In recent years, it has been applied to very large datasets involving many different tissues and cell types, in order to create a low dimensional global map of human gene expression. Here, we reevaluate this approach and show that the linear intrinsic dimensionality of this global map is higher than previously reported. Furthermore, we analyze in which cases PCA fails to detect biologically relevant information and point the reader to methods that overcome these limitations. Our results refine the current understanding of the overall structure of gene expression spaces and show that PCA critically depends on the effect size of the biological signal as well as on the fraction of samples containing this signal.
format Online
Article
Text
id pubmed-4890592
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-48905922016-06-09 Principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data Lenz, Michael Müller, Franz-Josef Zenke, Martin Schuppert, Andreas Sci Rep Article Principal components analysis (PCA) is a common unsupervised method for the analysis of gene expression microarray data, providing information on the overall structure of the analyzed dataset. In the recent years, it has been applied to very large datasets involving many different tissues and cell types, in order to create a low dimensional global map of human gene expression. Here, we reevaluate this approach and show that the linear intrinsic dimensionality of this global map is higher than previously reported. Furthermore, we analyze in which cases PCA fails to detect biologically relevant information and point the reader to methods that overcome these limitations. Our results refine the current understanding of the overall structure of gene expression spaces and show that PCA critically depends on the effect size of the biological signal as well as on the fraction of samples containing this signal. Nature Publishing Group 2016-06-02 /pmc/articles/PMC4890592/ /pubmed/27254731 http://dx.doi.org/10.1038/srep25696 Text en Copyright © 2016, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
spellingShingle Article
Lenz, Michael
Müller, Franz-Josef
Zenke, Martin
Schuppert, Andreas
Principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data
title Principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4890592/
https://www.ncbi.nlm.nih.gov/pubmed/27254731
http://dx.doi.org/10.1038/srep25696