A Guide for Sparse PCA: Model Comparison and Applications
PCA is a popular tool for exploring and summarizing multivariate data, especially those consisting of many variables. PCA, however, is often not simple to interpret, as the components are a linear combination of the variables. To address this issue, numerous methods have been proposed to sparsify th...
Main authors: | Guerra-Urzola, Rosember; Van Deun, Katrijn; Vera, Juan C.; Sijtsma, Klaas |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US 2021 |
Subjects: | Application Reviews and Case Studies |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8636462/ https://www.ncbi.nlm.nih.gov/pubmed/34185214 http://dx.doi.org/10.1007/s11336-021-09773-2 |
_version_ | 1784608532948058112 |
---|---|
author | Guerra-Urzola, Rosember Van Deun, Katrijn Vera, Juan C. Sijtsma, Klaas |
author_facet | Guerra-Urzola, Rosember Van Deun, Katrijn Vera, Juan C. Sijtsma, Klaas |
author_sort | Guerra-Urzola, Rosember |
collection | PubMed |
description | PCA is a popular tool for exploring and summarizing multivariate data, especially data consisting of many variables. PCA, however, is often not simple to interpret, as the components are linear combinations of the variables. To address this issue, numerous methods have been proposed to sparsify the nonzero coefficients in the components, including rotation-thresholding methods and, more recently, PCA methods subject to sparsity-inducing penalties or constraints. Here, we offer guidelines on how to choose among the different sparse PCA methods. The current literature lacks clear guidance on the properties and performance of the different sparse PCA methods, often relying on the misconception that the equivalence of the formulations for ordinary PCA also holds for sparse PCA. To guide potential users of sparse PCA methods, we first discuss several popular sparse PCA methods in terms of where the sparseness is imposed (on the loadings or on the weights), the assumed model, and the optimization criterion used to impose sparseness. Second, using an extensive simulation study, we assess each of these methods by means of performance measures such as squared relative error, misidentification rate, and percentage of explained variance for several data-generating models and conditions for the population model. Finally, two examples using empirical data are considered. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s11336-021-09773-2. |
format | Online Article Text |
id | pubmed-8636462 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-8636462 2021-12-03 A Guide for Sparse PCA: Model Comparison and Applications Guerra-Urzola, Rosember Van Deun, Katrijn Vera, Juan C. Sijtsma, Klaas Psychometrika Application Reviews and Case Studies Springer US 2021-06-29 2021 /pmc/articles/PMC8636462/ /pubmed/34185214 http://dx.doi.org/10.1007/s11336-021-09773-2 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/). |
title | A Guide for Sparse PCA: Model Comparison and Applications |
topic | Application Reviews and Case Studies |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8636462/ https://www.ncbi.nlm.nih.gov/pubmed/34185214 http://dx.doi.org/10.1007/s11336-021-09773-2 |