Random Projection for Fast and Efficient Multivariate Correlation Analysis of High-Dimensional Data: A New Approach
In recent years, the advent of great technological advances has produced a wealth of very high-dimensional data, and combining high-dimensional information from multiple sources is becoming increasingly important in an extending range of scientific disciplines. Partial Least Squares Correlation (PLS...
Main authors: | Grellmann, Claudia; Neumann, Jane; Bitzer, Sebastian; Kovacs, Peter; Tönjes, Anke; Westlye, Lars T.; Andreassen, Ole A.; Stumvoll, Michael; Villringer, Arno; Horstmann, Annette |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2016 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4894907/ https://www.ncbi.nlm.nih.gov/pubmed/27375677 http://dx.doi.org/10.3389/fgene.2016.00102 |
_version_ | 1782435741266083840 |
---|---|
author | Grellmann, Claudia Neumann, Jane Bitzer, Sebastian Kovacs, Peter Tönjes, Anke Westlye, Lars T. Andreassen, Ole A. Stumvoll, Michael Villringer, Arno Horstmann, Annette |
author_sort | Grellmann, Claudia |
collection | PubMed |
description | In recent years, the advent of great technological advances has produced a wealth of very high-dimensional data, and combining high-dimensional information from multiple sources is becoming increasingly important in an extending range of scientific disciplines. Partial Least Squares Correlation (PLSC) is a frequently used method for multivariate multimodal data integration. It is, however, computationally expensive in applications involving large numbers of variables, as required, for example, in genetic neuroimaging. To handle high-dimensional problems, dimension reduction might be implemented as pre-processing step. We propose a new approach that incorporates Random Projection (RP) for dimensionality reduction into PLSC to efficiently solve high-dimensional multimodal problems like genotype-phenotype associations. We name our new method PLSC-RP. Using simulated and experimental data sets containing whole genome SNP measures as genotypes and whole brain neuroimaging measures as phenotypes, we demonstrate that PLSC-RP is drastically faster than traditional PLSC while providing statistically equivalent results. We also provide evidence that dimensionality reduction using RP is data type independent. Therefore, PLSC-RP opens up a wide range of possible applications. It can be used for any integrative analysis that combines information from multiple sources. |
format | Online Article Text |
id | pubmed-4894907 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-4894907 2016-07-01 Front Genet Genetics Frontiers Media S.A. 2016-06-07 /pmc/articles/PMC4894907/ /pubmed/27375677 http://dx.doi.org/10.3389/fgene.2016.00102 Text en Copyright © 2016 Grellmann, Neumann, Bitzer, Kovacs, Tönjes, Westlye, Andreassen, Stumvoll, Villringer and Horstmann. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Grellmann, Claudia Neumann, Jane Bitzer, Sebastian Kovacs, Peter Tönjes, Anke Westlye, Lars T. Andreassen, Ole A. Stumvoll, Michael Villringer, Arno Horstmann, Annette Random Projection for Fast and Efficient Multivariate Correlation Analysis of High-Dimensional Data: A New Approach |
title | Random Projection for Fast and Efficient Multivariate Correlation Analysis of High-Dimensional Data: A New Approach |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4894907/ https://www.ncbi.nlm.nih.gov/pubmed/27375677 http://dx.doi.org/10.3389/fgene.2016.00102 |
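
The abstract in this record describes reducing the dimensionality of a very wide data block with Random Projection before running PLSC. The following is a minimal sketch of that general idea, not the authors' implementation: the Gaussian projection matrix, the z-scoring step, the simulated data, and every function and variable name (`center_scale`, `plsc`, `plsc_rp`, `k`) are assumptions made for illustration. PLSC is taken here in its common form as a singular value decomposition of the cross-product matrix between the two blocks.

```python
import numpy as np


def center_scale(a):
    """Column-wise z-scoring (an assumed preprocessing choice)."""
    a = a - a.mean(axis=0)
    return a / a.std(axis=0, ddof=1)


def plsc(x, y):
    """PLSC as an SVD of the cross-product matrix Y^T X.

    Returns (u, s, vt): saliences for Y, singular values, saliences for X.
    """
    r = y.T @ x
    return np.linalg.svd(r, full_matrices=False)


def plsc_rp(x, y, k, rng):
    """Project X from p to k columns with a Gaussian random matrix, then run PLSC.

    Entries are drawn from N(0, 1/k), so X @ omega @ omega.T @ X.T
    approximates X @ X.T in expectation.
    """
    p = x.shape[1]
    omega = rng.normal(scale=1.0 / np.sqrt(k), size=(p, k))
    x_low = x @ omega  # n x k instead of n x p
    u, s, vt = plsc(x_low, y)
    return u, s, vt, x_low


# Simulated data: one shared latent variable drives both blocks (hypothetical setup).
rng = np.random.default_rng(0)
n, p, q, k = 50, 2000, 30, 500
t = rng.normal(size=(n, 1))
x = center_scale(t @ rng.normal(size=(1, p)) + rng.normal(size=(n, p)))
y = center_scale(t @ rng.normal(size=(1, q)) + rng.normal(size=(n, q)))

u_full, s_full, _ = plsc(x, y)
u_rp, s_rp, _, x_low = plsc_rp(x, y, k, rng)

# Relative difference of the leading singular value between the two runs.
rel_err = abs(s_rp[0] - s_full[0]) / s_full[0]
print(f"projected X shape: {x_low.shape}, relative error of s[0]: {rel_err:.3f}")
```

In this sketch the decomposition runs on a 30 x 500 matrix instead of a 30 x 2000 one. The saliences for the projected block live in the reduced space, so mapping them back to the original variables would need an extra step that is not shown here.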