Prediction With Dimension Reduction of Multiple Molecular Data Sources for Patient Survival

Bibliographic Details
Main Authors: Kaplan, Adam; Lock, Eric F
Format: Online Article Text
Language: English
Published: SAGE Publications 2017
Subjects: Methodology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5510774/
https://www.ncbi.nlm.nih.gov/pubmed/28747816
http://dx.doi.org/10.1177/1176935117718517
author Kaplan, Adam
Lock, Eric F
collection PubMed
description Predictive modeling from high-dimensional genomic data is often preceded by a dimension reduction step, such as principal component analysis (PCA). However, applying PCA is not straightforward for multisource data, in which multiple sources of 'omics data measure different but related biological components. In this article, we use recent advances in the dimension reduction of multisource data for predictive modeling. In particular, we apply exploratory results from Joint and Individual Variation Explained (JIVE), an extension of PCA for multisource data, to the prediction of differing response types. We conduct simulations to illustrate the practical advantages and interpretability of our approach. As an application example, we consider predicting survival for patients with glioblastoma multiforme from 3 data sources measuring messenger RNA expression, microRNA expression, and DNA methylation. We also introduce a method to estimate JIVE scores for new samples that were not used in the initial dimension reduction and study its theoretical properties; this method is implemented in the function jive.predict of the R package R.JIVE, available on CRAN.
format Online
Article
Text
id pubmed-5510774
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-5510774 2017-07-26 Prediction With Dimension Reduction of Multiple Molecular Data Sources for Patient Survival. Kaplan, Adam; Lock, Eric F. Cancer Inform. Methodology. SAGE Publications 2017-07-11. /pmc/articles/PMC5510774/ /pubmed/28747816 http://dx.doi.org/10.1177/1176935117718517 Text en © The Author(s) 2017 http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (http://www.creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Prediction With Dimension Reduction of Multiple Molecular Data Sources for Patient Survival
topic Methodology