Memory Efficient PCA Methods for Large Group ICA
Principal component analysis (PCA) is widely used for data reduction in group independent component analysis (ICA) of fMRI data. Commonly, group-level PCA of temporally concatenated datasets is computed prior to ICA of the group principal components. This work focuses on reducing very high dimensional temporally concatenated datasets into their group PCA space.
Main Authors: | Rachakonda, Srinivas; Silva, Rogers F.; Liu, Jingyu; Calhoun, Vince D. |
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2016 |
Subjects: | Neuroscience |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4735350/ https://www.ncbi.nlm.nih.gov/pubmed/26869874 http://dx.doi.org/10.3389/fnins.2016.00017 |
_version_ | 1782413065750315008 |
author | Rachakonda, Srinivas; Silva, Rogers F.; Liu, Jingyu; Calhoun, Vince D. |
author_facet | Rachakonda, Srinivas; Silva, Rogers F.; Liu, Jingyu; Calhoun, Vince D. |
author_sort | Rachakonda, Srinivas |
collection | PubMed |
description | Principal component analysis (PCA) is widely used for data reduction in group independent component analysis (ICA) of fMRI data. Commonly, group-level PCA of temporally concatenated datasets is computed prior to ICA of the group principal components. This work focuses on reducing very high dimensional temporally concatenated datasets into their group PCA space. Existing randomized PCA methods can determine the PCA subspace with minimal memory requirements and, thus, are ideal for solving large PCA problems. Since the number of dataloads is not typically optimized, we extend one of these methods to compute PCA of very large datasets with a minimal number of dataloads. This method is coined multi power iteration (MPOWIT). The key idea behind MPOWIT is to estimate a subspace larger than the desired one, while checking for convergence of only the smaller subset of interest. The number of iterations is reduced considerably (as well as the number of dataloads), accelerating convergence without loss of accuracy. More importantly, in the proposed implementation of MPOWIT, the memory required for successful recovery of the group principal components becomes independent of the number of subjects analyzed. Highly efficient subsampled eigenvalue decomposition techniques are also introduced, furnishing excellent PCA subspace approximations that can be used for intelligent initialization of randomized methods such as MPOWIT. Together, these developments enable efficient estimation of accurate principal components, as we illustrate by solving a 1600-subject group-level PCA of fMRI with standard acquisition parameters, on a regular desktop computer with only 4 GB RAM, in just a few hours. MPOWIT is also highly scalable and could realistically solve group-level PCA of fMRI on thousands of subjects, or more, using standard hardware, limited only by time, not memory. Also, the MPOWIT algorithm is highly parallelizable, which would enable fast, distributed implementations ideal for big data analysis. Implications for other methods such as expectation maximization PCA (EM PCA) are also presented. Based on our results, general recommendations for efficient application of PCA methods are given according to problem size and available computational resources. MPOWIT and all other methods discussed here are implemented and readily available in the open source GIFT software. |
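The key idea named in the abstract (iterate on an oversampled subspace while checking convergence of only the top components of interest) can be sketched as a block power iteration. This is a minimal illustrative sketch in NumPy under stated assumptions, not the authors' GIFT implementation: the function name, the oversampling default, and the Ritz-value convergence test are all hypothetical choices for illustration.

```python
# Minimal sketch of the oversampled-subspace idea behind MPOWIT:
# block (subspace) power iteration on k + p directions, monitoring
# convergence of only the top-k components of interest.
import numpy as np

def oversampled_power_pca(X, k, oversample=10, tol=1e-6, max_iter=100):
    """Estimate the top-k principal directions of X (samples x features)."""
    rng = np.random.default_rng(0)
    n_feat = X.shape[1]
    m = k + oversample                       # oversampled subspace size
    Q = np.linalg.qr(rng.standard_normal((n_feat, m)))[0]
    prev = np.zeros(k)
    Xc = X - X.mean(axis=0)                  # center the data
    for _ in range(max_iter):
        # Apply the covariance implicitly: (Xc^T Xc) Q, one pass over the data
        Z = Xc.T @ (Xc @ Q)
        Q, R = np.linalg.qr(Z)
        # |diag(R)| approximates the leading eigenvalues of Xc^T Xc;
        # monitor only the top-k of them, not all k + p
        evals = np.sort(np.abs(np.diag(R)))[::-1][:k]
        if np.max(np.abs(evals - prev) / np.maximum(evals, 1e-12)) < tol:
            break
        prev = evals
    # Rayleigh-Ritz step: extract the leading k eigenvectors from the subspace
    B = Q.T @ (Xc.T @ (Xc @ Q))
    w, V = np.linalg.eigh(B)
    order = np.argsort(w)[::-1][:k]
    return Q @ V[:, order]                   # top-k principal directions
```

The oversampled directions absorb the slowly converging tail of the spectrum, which is why the iteration count (and hence the number of passes over the data, the "dataloads" in the abstract) drops relative to iterating on exactly k directions.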
format | Online Article Text |
id | pubmed-4735350 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-4735350 2016-02-11 Memory Efficient PCA Methods for Large Group ICA Rachakonda, Srinivas; Silva, Rogers F.; Liu, Jingyu; Calhoun, Vince D. Front Neurosci Neuroscience Frontiers Media S.A. 2016-02-02 /pmc/articles/PMC4735350/ /pubmed/26869874 http://dx.doi.org/10.3389/fnins.2016.00017 Text en Copyright © 2016 Rachakonda, Silva, Liu and Calhoun. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience Rachakonda, Srinivas; Silva, Rogers F.; Liu, Jingyu; Calhoun, Vince D. Memory Efficient PCA Methods for Large Group ICA |
title | Memory Efficient PCA Methods for Large Group ICA |
title_full | Memory Efficient PCA Methods for Large Group ICA |
title_fullStr | Memory Efficient PCA Methods for Large Group ICA |
title_full_unstemmed | Memory Efficient PCA Methods for Large Group ICA |
title_short | Memory Efficient PCA Methods for Large Group ICA |
title_sort | memory efficient pca methods for large group ica |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4735350/ https://www.ncbi.nlm.nih.gov/pubmed/26869874 http://dx.doi.org/10.3389/fnins.2016.00017 |
work_keys_str_mv | AT rachakondasrinivas memoryefficientpcamethodsforlargegroupica AT silvarogersf memoryefficientpcamethodsforlargegroupica AT liujingyu memoryefficientpcamethodsforlargegroupica AT calhounvinced memoryefficientpcamethodsforlargegroupica |