Supervised Regularized Canonical Correlation Analysis: integrating histologic and proteomic measurements for predicting biochemical recurrence following prostate surgery

BACKGROUND: Multimodal data, especially imaging and non-imaging data, is being routinely acquired in the context of disease diagnostics; however, computational challenges have limited the ability to quantitatively integrate imaging and non-imaging data channels with different dimensionalities and sc...

Bibliographic Details
Main authors: Golugula, Abhishek; Lee, George; Master, Stephen R; Feldman, Michael D; Tomaszewski, John E; Speicher, David W; Madabhushi, Anant
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3267835/
https://www.ncbi.nlm.nih.gov/pubmed/22182303
http://dx.doi.org/10.1186/1471-2105-12-483
_version_ 1782222331266990080
author Golugula, Abhishek
Lee, George
Master, Stephen R
Feldman, Michael D
Tomaszewski, John E
Speicher, David W
Madabhushi, Anant
author_facet Golugula, Abhishek
Lee, George
Master, Stephen R
Feldman, Michael D
Tomaszewski, John E
Speicher, David W
Madabhushi, Anant
author_sort Golugula, Abhishek
collection PubMed
description BACKGROUND: Multimodal data, especially imaging and non-imaging data, are routinely acquired in the context of disease diagnostics; however, computational challenges have limited the ability to quantitatively integrate imaging and non-imaging data channels with different dimensionalities and scales. To the best of our knowledge, relatively few attempts have been made to quantitatively fuse such data to construct classifiers, and none have attempted to quantitatively combine histology (imaging) and proteomic (non-imaging) measurements for making diagnostic and prognostic predictions. The objective of this work is to create a common subspace, called a metaspace, that simultaneously accommodates both the imaging and non-imaging data (and hence data corresponding to different scales and dimensionalities). This metaspace can be used to build a meta-classifier that produces better classification results than a classifier based on a single modality alone. Canonical Correlation Analysis (CCA) and Regularized CCA (RCCA) are statistical techniques that extract correlations between two modes of data to construct a homogeneous, uniform representation of heterogeneous data channels. In this paper, we present a novel modification to CCA and RCCA, Supervised Regularized Canonical Correlation Analysis (SRCCA), that (1) enables the quantitative integration of data from multiple modalities using a feature selection scheme, (2) is regularized, and (3) is computationally cheap. We leverage this SRCCA framework towards the fusion of proteomic and histologic image signatures for identifying prostate cancer patients at risk of 5-year biochemical recurrence following radical prostatectomy. RESULTS: A cohort of 19 grade- and stage-matched prostate cancer patients, all of whom had undergone radical prostatectomy, was considered in this study; 10 had biochemical recurrence within 5 years of surgery and 9 did not.
The aim was to construct a lower-dimensional fused metaspace comprising both the histological and proteomic measurements obtained from the site of the dominant nodule on the surgical specimen. In conjunction with SRCCA, a random forest classifier was able to identify prostate cancer patients who developed biochemical recurrence within 5 years with a maximum classification accuracy of 93%. CONCLUSIONS: The classifier performance in the SRCCA space was found to be statistically significantly higher than in the fused data representations obtained not only from CCA and RCCA, but also from two other statistical techniques, Principal Component Analysis and Partial Least Squares Regression. These results suggest that SRCCA is a computationally efficient and highly accurate scheme for representing multimodal (histologic and proteomic) data in a metaspace and that it could be used to construct fused biomarkers for predicting disease recurrence and prognosis.
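To make the idea of a shared metaspace concrete, here is a minimal NumPy sketch of plain regularized CCA, the building block the abstract names. It is an illustration under stated assumptions, not the authors' SRCCA: the supervised feature selection step is omitted, and the function name `regularized_cca`, the parameters `lam_x` and `lam_y`, and their default values are hypothetical.

```python
import numpy as np


def regularized_cca(X, Y, lam_x=0.1, lam_y=0.1, n_components=2):
    """Sketch of regularized CCA for two views X (n x p) and Y (n x q).

    Returns projection matrices Wx, Wy and the leading (regularized)
    canonical correlations. Names and defaults are illustrative only.
    """
    # Center each view so the cross-products below are covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Ridge-style regularization keeps both covariance matrices
    # invertible even when features outnumber samples.
    Cxx = Xc.T @ Xc / (n - 1) + lam_x * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + lam_y * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)

    # Whiten both views with Cholesky factors (C = L L^T), then take
    # the SVD of the whitened cross-covariance Lx^{-1} Cxy Ly^{-T}.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)

    # Map the singular vectors back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U[:, :n_components])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return Wx, Wy, s[:n_components]


def fuse(X, Y, Wx, Wy):
    """Project both centered views and concatenate them into one embedding."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    return np.hstack([Xc @ Wx, Yc @ Wy])
```

The fused embedding returned by `fuse` could then be passed to any classifier, for example a random forest as in the abstract. In practice the regularization strengths would need tuning, which this sketch does not attempt.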
format Online
Article
Text
id pubmed-3267835
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3267835 2012-01-30 Supervised Regularized Canonical Correlation Analysis: integrating histologic and proteomic measurements for predicting biochemical recurrence following prostate surgery Golugula, Abhishek Lee, George Master, Stephen R Feldman, Michael D Tomaszewski, John E Speicher, David W Madabhushi, Anant BMC Bioinformatics Methodology Article BioMed Central 2011-12-19 /pmc/articles/PMC3267835/ /pubmed/22182303 http://dx.doi.org/10.1186/1471-2105-12-483 Text en Copyright ©2011 Golugula et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Golugula, Abhishek
Lee, George
Master, Stephen R
Feldman, Michael D
Tomaszewski, John E
Speicher, David W
Madabhushi, Anant
Supervised Regularized Canonical Correlation Analysis: integrating histologic and proteomic measurements for predicting biochemical recurrence following prostate surgery
title Supervised Regularized Canonical Correlation Analysis: integrating histologic and proteomic measurements for predicting biochemical recurrence following prostate surgery
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3267835/
https://www.ncbi.nlm.nih.gov/pubmed/22182303
http://dx.doi.org/10.1186/1471-2105-12-483