Comparison of pathway and gene-level models for cancer prognosis prediction

BACKGROUND: Cancer prognosis prediction is valuable for patients and clinicians because it allows them to appropriately manage care. A promising direction for improving the performance and interpretation of expression-based predictive models involves the aggregation of gene-level data into biologica...

Bibliographic Details
Main Authors:	Zheng, Xingyu; Amos, Christopher I.; Frost, H. Robert
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2020
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7048092/ https://www.ncbi.nlm.nih.gov/pubmed/32111152 http://dx.doi.org/10.1186/s12859-020-3423-z

author	Zheng, Xingyu; Amos, Christopher I.; Frost, H. Robert
collection	PubMed
description	BACKGROUND: Cancer prognosis prediction is valuable for patients and clinicians because it allows them to appropriately manage care. A promising direction for improving the performance and interpretation of expression-based predictive models involves the aggregation of gene-level data into biological pathways. While many studies have used pathway-level predictors for cancer survival analysis, a comprehensive comparison of pathway-level and gene-level prognostic models has not been performed. To address this gap, we characterized the performance of penalized Cox proportional hazards models built using either pathway- or gene-level predictors for the cancers profiled in The Cancer Genome Atlas (TCGA) and pathways from the Molecular Signatures Database (MSigDB). RESULTS: When analyzing TCGA data, we found that pathway-level models are more parsimonious, more robust, more computationally efficient and easier to interpret than gene-level models with similar predictive performance. For example, both pathway-level and gene-level models have an average Cox concordance index of ~0.85 for the TCGA glioma cohort; however, the gene-level model has twice as many predictors on average, its predictor composition is less stable across cross-validation folds, and estimation takes 40 times as long as for the pathway-level model. When the complex correlation structure of the data is broken by permutation, the pathway-level model has greater predictive performance while still retaining superior interpretative power, robustness, parsimony and computational efficiency relative to the gene-level models. For example, the average concordance index of the pathway-level model increases to 0.88 while the gene-level model falls to 0.56 for the TCGA glioma cohort using survival times simulated from uncorrelated gene expression data.
CONCLUSION: The results of this study show that when the correlations among gene expression values are low, pathway-level analyses can yield better predictive performance, greater interpretative power, more robust models and less computational cost relative to a gene-level model. When correlations among genes are high, a pathway-level analysis provides equivalent predictive power compared to a gene-level analysis while retaining the advantages of interpretability, robustness and computational efficiency.
format	Online Article Text
id	pubmed-7048092
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-7048092 2020-03-05 Comparison of pathway and gene-level models for cancer prognosis prediction Zheng, Xingyu; Amos, Christopher I.; Frost, H. Robert BMC Bioinformatics Methodology Article BioMed Central 2020-02-28 /pmc/articles/PMC7048092/ /pubmed/32111152 http://dx.doi.org/10.1186/s12859-020-3423-z Text en © The Author(s). 2020 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Comparison of pathway and gene-level models for cancer prognosis prediction
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7048092/ https://www.ncbi.nlm.nih.gov/pubmed/32111152 http://dx.doi.org/10.1186/s12859-020-3423-z