Predicting chemoinsensitivity in breast cancer with ’omics/digital pathology data fusion
Predicting response to treatment and disease-specific deaths are key tasks in cancer research yet there is a lack of methodologies to achieve these. Large-scale ’omics and digital pathology technologies have led to the need for effective statistical methods for data fusion to extract the most useful...
Main authors: | Savage, Richard S.; Yuan, Yinyin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Royal Society Publishing 2016 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4785962/ https://www.ncbi.nlm.nih.gov/pubmed/26998311 http://dx.doi.org/10.1098/rsos.140501 |
_version_ | 1782420477038297088 |
---|---|
author | Savage, Richard S.; Yuan, Yinyin |
collection | PubMed |
description | Predicting response to treatment and disease-specific deaths are key tasks in cancer research yet there is a lack of methodologies to achieve these. Large-scale ’omics and digital pathology technologies have led to the need for effective statistical methods for data fusion to extract the most useful patterns from these diverse data types. We present FusionGP, a method for combining heterogeneous data types designed specifically for predicting outcome of treatment and disease. FusionGP is a Gaussian process model that includes a generalization of feature selection for biomarker discovery, allowing for simultaneous, sparse feature selection across multiple data types. Importantly, it can accommodate highly nonlinear structure in the data, and automatically infers the optimal contribution from each input data type. FusionGP compares favourably to several popular classification methods, including the Random Forest classifier, a stepwise logistic regression model and the Support Vector Machine on single data types. By combining gene expression, copy number alteration and digital pathology image data in 119 estrogen receptor (ER)-negative and 345 ER-positive breast tumours, we aim to predict two important clinical outcomes: death and chemoinsensitivity. While gene expression data give the best predictive performance in the majority of cases, the digital pathology data are much better for predicting death in ER-negative cases. Thus, FusionGP is a new tool for selecting informative features from heterogeneous data types and predicting treatment response and prognosis. |
format | Online Article Text |
id | pubmed-4785962 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | The Royal Society Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-4785962 2016-03-18 Predicting chemoinsensitivity in breast cancer with ’omics/digital pathology data fusion Savage, Richard S. Yuan, Yinyin R Soc Open Sci Genetics The Royal Society Publishing 2016-02-10 /pmc/articles/PMC4785962/ /pubmed/26998311 http://dx.doi.org/10.1098/rsos.140501 Text en http://creativecommons.org/licenses/by/4.0/ © 2016 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited. |
title | Predicting chemoinsensitivity in breast cancer with ’omics/digital pathology data fusion |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4785962/ https://www.ncbi.nlm.nih.gov/pubmed/26998311 http://dx.doi.org/10.1098/rsos.140501 |
work_keys_str_mv | AT savagerichards predictingchemoinsensitivityinbreastcancerwithomicsdigitalpathologydatafusion AT yuanyinyin predictingchemoinsensitivityinbreastcancerwithomicsdigitalpathologydatafusion |
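
The description above says FusionGP is a Gaussian process model that selects features sparsely across several data types and weights the contribution of each type. The record does not give the model's equations, so the following is only a minimal sketch of one generic way to express that idea: a weighted sum of per-data-type kernels, each with a per-feature relevance weight. It is not the authors' implementation. The function names, toy values, relevance weights and data-type weights are all hypothetical, and here they are fixed by hand rather than inferred from data.

```python
import math


def ard_rbf_kernel(samples, relevance):
    """Squared-exponential kernel with one non-negative relevance weight per feature.

    A relevance of 0.0 removes that feature from the kernel entirely,
    which is one simple way to express sparse feature selection.
    (Hypothetical helper, not taken from the paper.)
    """
    def k(a, b):
        sq = sum(r * (x - y) ** 2 for r, x, y in zip(relevance, a, b))
        return math.exp(-0.5 * sq)

    return [[k(a, b) for b in samples] for a in samples]


def fuse_kernels(kernels, weights):
    """Weighted sum of kernel matrices, one matrix per data type.

    With non-negative weights the sum of valid kernels is again a valid kernel.
    (Hypothetical helper, not taken from the paper.)
    """
    n = len(kernels[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Toy inputs for three tumours: made-up numbers, for illustration only.
expression = [[0.1, 2.0], [0.2, -1.0], [1.5, 0.3]]  # two gene-expression features
pathology = [[0.5], [0.4], [2.0]]                   # one image-derived feature

# Assumed relevance weights: the second expression feature is switched off.
k_expr = ard_rbf_kernel(expression, relevance=[1.0, 0.0])
k_path = ard_rbf_kernel(pathology, relevance=[2.0])

# Assumed data-type weights; a full model would learn these from the data.
weights = [0.7, 0.3]
fused = fuse_kernels([k_expr, k_path], weights)

for row in fused:
    print(["%.3f" % v for v in row])
```

In a complete model the fused matrix would serve as the covariance of a Gaussian process over the outcome, and the relevance and data-type weights would be estimated rather than set manually.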