Prediction With Dimension Reduction of Multiple Molecular Data Sources for Patient Survival
Predictive modeling from high-dimensional genomic data is often preceded by a dimension reduction step, such as principal component analysis (PCA). However, the application of PCA is not straightforward for multisource data, wherein multiple sources of ‘omics data measure different but related biolo...
Main Authors: | Kaplan, Adam; Lock, Eric F |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | SAGE Publications 2017 |
Subjects: | Methodology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5510774/ https://www.ncbi.nlm.nih.gov/pubmed/28747816 http://dx.doi.org/10.1177/1176935117718517 |
_version_ | 1783250224157818880 |
---|---|
author | Kaplan, Adam Lock, Eric F |
author_facet | Kaplan, Adam Lock, Eric F |
author_sort | Kaplan, Adam |
collection | PubMed |
description | Predictive modeling from high-dimensional genomic data is often preceded by a dimension reduction step, such as principal component analysis (PCA). However, the application of PCA is not straightforward for multisource data, wherein multiple sources of ‘omics data measure different but related biological components. In this article, we use recent advances in the dimension reduction of multisource data for predictive modeling. In particular, we apply exploratory results from Joint and Individual Variation Explained (JIVE), an extension of PCA for multisource data, for prediction of differing response types. We conduct illustrative simulations to illustrate the practical advantages and interpretability of our approach. As an application example, we consider predicting survival for patients with glioblastoma multiforme from 3 data sources measuring messenger RNA expression, microRNA expression, and DNA methylation. We also introduce a method to estimate JIVE scores for new samples that were not used in the initial dimension reduction and study its theoretical properties; this method is implemented in the R package R.JIVE on CRAN, in the function jive.predict. |
format | Online Article Text |
id | pubmed-5510774 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | SAGE Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-55107742017-07-26 Prediction With Dimension Reduction of Multiple Molecular Data Sources for Patient Survival Kaplan, Adam Lock, Eric F Cancer Inform Methodology Predictive modeling from high-dimensional genomic data is often preceded by a dimension reduction step, such as principal component analysis (PCA). However, the application of PCA is not straightforward for multisource data, wherein multiple sources of ‘omics data measure different but related biological components. In this article, we use recent advances in the dimension reduction of multisource data for predictive modeling. In particular, we apply exploratory results from Joint and Individual Variation Explained (JIVE), an extension of PCA for multisource data, for prediction of differing response types. We conduct illustrative simulations to illustrate the practical advantages and interpretability of our approach. As an application example, we consider predicting survival for patients with glioblastoma multiforme from 3 data sources measuring messenger RNA expression, microRNA expression, and DNA methylation. We also introduce a method to estimate JIVE scores for new samples that were not used in the initial dimension reduction and study its theoretical properties; this method is implemented in the R package R.JIVE on CRAN, in the function jive.predict. SAGE Publications 2017-07-11 /pmc/articles/PMC5510774/ /pubmed/28747816 http://dx.doi.org/10.1177/1176935117718517 Text en © The Author(s) 2017 http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (http://www.creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page(https://us.sagepub.com/en-us/nam/open-access-at-sage). |
spellingShingle | Methodology Kaplan, Adam Lock, Eric F Prediction With Dimension Reduction of Multiple Molecular Data Sources for Patient Survival |
title | Prediction With Dimension Reduction of Multiple Molecular Data Sources for Patient Survival |
title_full | Prediction With Dimension Reduction of Multiple Molecular Data Sources for Patient Survival |
title_fullStr | Prediction With Dimension Reduction of Multiple Molecular Data Sources for Patient Survival |
title_full_unstemmed | Prediction With Dimension Reduction of Multiple Molecular Data Sources for Patient Survival |
title_short | Prediction With Dimension Reduction of Multiple Molecular Data Sources for Patient Survival |
title_sort | prediction with dimension reduction of multiple molecular data sources for patient survival |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5510774/ https://www.ncbi.nlm.nih.gov/pubmed/28747816 http://dx.doi.org/10.1177/1176935117718517 |
work_keys_str_mv | AT kaplanadam predictionwithdimensionreductionofmultiplemoleculardatasourcesforpatientsurvival AT lockericf predictionwithdimensionreductionofmultiplemoleculardatasourcesforpatientsurvival |