Principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data
Principal components analysis (PCA) is a common unsupervised method for the analysis of gene expression microarray data, providing information on the overall structure of the analyzed dataset. In the recent years, it has been applied to very large datasets involving many different tissues and cell t...
Main Authors: | Lenz, Michael; Müller, Franz-Josef; Zenke, Martin; Schuppert, Andreas |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4890592/ https://www.ncbi.nlm.nih.gov/pubmed/27254731 http://dx.doi.org/10.1038/srep25696 |
_version_ | 1782435129879166976 |
---|---|
author | Lenz, Michael Müller, Franz-Josef Zenke, Martin Schuppert, Andreas |
author_facet | Lenz, Michael Müller, Franz-Josef Zenke, Martin Schuppert, Andreas |
author_sort | Lenz, Michael |
collection | PubMed |
description | Principal components analysis (PCA) is a common unsupervised method for the analysis of gene expression microarray data, providing information on the overall structure of the analyzed dataset. In the recent years, it has been applied to very large datasets involving many different tissues and cell types, in order to create a low dimensional global map of human gene expression. Here, we reevaluate this approach and show that the linear intrinsic dimensionality of this global map is higher than previously reported. Furthermore, we analyze in which cases PCA fails to detect biologically relevant information and point the reader to methods that overcome these limitations. Our results refine the current understanding of the overall structure of gene expression spaces and show that PCA critically depends on the effect size of the biological signal as well as on the fraction of samples containing this signal. |
format | Online Article Text |
id | pubmed-4890592 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-48905922016-06-09 Principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data Lenz, Michael Müller, Franz-Josef Zenke, Martin Schuppert, Andreas Sci Rep Article Principal components analysis (PCA) is a common unsupervised method for the analysis of gene expression microarray data, providing information on the overall structure of the analyzed dataset. In the recent years, it has been applied to very large datasets involving many different tissues and cell types, in order to create a low dimensional global map of human gene expression. Here, we reevaluate this approach and show that the linear intrinsic dimensionality of this global map is higher than previously reported. Furthermore, we analyze in which cases PCA fails to detect biologically relevant information and point the reader to methods that overcome these limitations. Our results refine the current understanding of the overall structure of gene expression spaces and show that PCA critically depends on the effect size of the biological signal as well as on the fraction of samples containing this signal. Nature Publishing Group 2016-06-02 /pmc/articles/PMC4890592/ /pubmed/27254731 http://dx.doi.org/10.1038/srep25696 Text en Copyright © 2016, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ |
spellingShingle | Article Lenz, Michael Müller, Franz-Josef Zenke, Martin Schuppert, Andreas Principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data |
title | Principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data |
title_full | Principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data |
title_fullStr | Principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data |
title_full_unstemmed | Principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data |
title_short | Principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data |
title_sort | principal components analysis and the reported low intrinsic dimensionality of gene expression microarray data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4890592/ https://www.ncbi.nlm.nih.gov/pubmed/27254731 http://dx.doi.org/10.1038/srep25696 |
work_keys_str_mv | AT lenzmichael principalcomponentsanalysisandthereportedlowintrinsicdimensionalityofgeneexpressionmicroarraydata AT mullerfranzjosef principalcomponentsanalysisandthereportedlowintrinsicdimensionalityofgeneexpressionmicroarraydata AT zenkemartin principalcomponentsanalysisandthereportedlowintrinsicdimensionalityofgeneexpressionmicroarraydata AT schuppertandreas principalcomponentsanalysisandthereportedlowintrinsicdimensionalityofgeneexpressionmicroarraydata |
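
The description above states that PCA depends on the effect size of the biological signal and on the fraction of samples containing it. The following is a minimal synthetic sketch of that dependence, not the authors' analysis: the function name `pc1_label_correlation`, the Gaussian noise model, the sample and gene counts, and the chosen effect size are all illustrative assumptions. It plants a mean shift in a subset of samples and checks how well the first principal component tracks the planted group.

```python
import numpy as np

def pc1_label_correlation(fraction, effect_size, n_samples=200, n_genes=50, seed=0):
    """Absolute correlation between PC1 scores and a planted group label.

    Hypothetical helper for illustration; all defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    # Unit-variance Gaussian noise stands in for variation unrelated to the signal.
    data = rng.standard_normal((n_samples, n_genes))
    # Plant the signal: shift a fixed unit-length gene direction in the first samples.
    n_signal = max(1, round(fraction * n_samples))
    labels = np.zeros(n_samples)
    labels[:n_signal] = 1.0
    direction = np.zeros(n_genes)
    direction[:10] = 1.0 / np.sqrt(10)
    data += effect_size * np.outer(labels, direction)
    # PCA via SVD of the column-centered matrix; PC1 scores are U[:, 0] * S[0].
    centered = data - data.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    pc1_scores = u[:, 0] * s[0]
    return abs(np.corrcoef(pc1_scores, labels)[0, 1])

common = pc1_label_correlation(fraction=0.5, effect_size=4.0)
rare = pc1_label_correlation(fraction=0.02, effect_size=4.0)
print(f"signal in 50% of samples: |r| = {common:.2f}")
print(f"signal in  2% of samples: |r| = {rare:.2f}")
```

In this setup a shift of size d carried by a fraction f of the samples adds variance d²·f·(1−f) along the signal direction, so the same effect size contributes far less variance when only a few samples carry it, and the leading component is then more likely to follow noise instead.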