
Predicting chemoinsensitivity in breast cancer with ’omics/digital pathology data fusion

Predicting response to treatment and disease-specific deaths are key tasks in cancer research yet there is a lack of methodologies to achieve these. Large-scale ’omics and digital pathology technologies have led to the need for effective statistical methods for data fusion to extract the most useful...


Bibliographic Details
Main Authors: Savage, Richard S., Yuan, Yinyin
Format: Online Article Text
Language: English
Published: The Royal Society Publishing 2016
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4785962/
https://www.ncbi.nlm.nih.gov/pubmed/26998311
http://dx.doi.org/10.1098/rsos.140501
author Savage, Richard S.
Yuan, Yinyin
collection PubMed
description Predicting response to treatment and disease-specific deaths are key tasks in cancer research, yet there is a lack of methodologies to achieve these. Large-scale ’omics and digital pathology technologies have led to the need for effective statistical methods for data fusion to extract the most useful patterns from these diverse data types. We present FusionGP, a method for combining heterogeneous data types designed specifically for predicting outcome of treatment and disease. FusionGP is a Gaussian process model that includes a generalization of feature selection for biomarker discovery, allowing for simultaneous, sparse feature selection across multiple data types. Importantly, it can accommodate highly nonlinear structure in the data, and automatically infers the optimal contribution from each input data type. FusionGP compares favourably to several popular classification methods, including the Random Forest classifier, a stepwise logistic regression model and the Support Vector Machine on single data types. By combining gene expression, copy number alteration and digital pathology image data in 119 estrogen receptor (ER)-negative and 345 ER-positive breast tumours, we aim to predict two important clinical outcomes: death and chemoinsensitivity. While gene expression data give the best predictive performance in the majority of cases, the digital pathology data are much better for predicting death in ER-negative cases. Thus, FusionGP is a new tool for selecting informative features from heterogeneous data types and predicting treatment response and prognosis.
format Online
Article
Text
id pubmed-4785962
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher The Royal Society Publishing
record_format MEDLINE/PubMed
spelling pubmed-4785962 2016-03-18 Predicting chemoinsensitivity in breast cancer with ’omics/digital pathology data fusion Savage, Richard S. Yuan, Yinyin R Soc Open Sci Genetics The Royal Society Publishing 2016-02-10 /pmc/articles/PMC4785962/ /pubmed/26998311 http://dx.doi.org/10.1098/rsos.140501 Text en http://creativecommons.org/licenses/by/4.0/ © 2016 The Authors.
Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.
title Predicting chemoinsensitivity in breast cancer with ’omics/digital pathology data fusion
topic Genetics