
Accurate cancer phenotype prediction with AKLIMATE, a stacked kernel learner integrating multimodal genomic data and pathway knowledge

Advancements in sequencing have led to the proliferation of multi-omic profiles of human cells under different conditions and perturbations. In addition, many databases have amassed information about pathways and gene “signatures”—patterns of gene expression associated with specific cellular and phe...


Bibliographic Details
Main Authors: Uzunangelov, Vladislav; Wong, Christopher K.; Stuart, Joshua M.
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8081343/
https://www.ncbi.nlm.nih.gov/pubmed/33861732
http://dx.doi.org/10.1371/journal.pcbi.1008878
author Uzunangelov, Vladislav
Wong, Christopher K.
Stuart, Joshua M.
collection PubMed
description Advancements in sequencing have led to the proliferation of multi-omic profiles of human cells under different conditions and perturbations. In addition, many databases have amassed information about pathways and gene “signatures”—patterns of gene expression associated with specific cellular and phenotypic contexts. An important current challenge in systems biology is to leverage such knowledge about gene coordination to maximize the predictive power and generalization of models applied to high-throughput datasets. However, few such integrative approaches exist that also provide interpretable results quantifying the importance of individual genes and pathways to model accuracy. We introduce AKLIMATE, a first kernel-based stacked learner that seamlessly incorporates multi-omics feature data with prior information in the form of pathways for either regression or classification tasks. AKLIMATE uses a novel multiple-kernel learning framework where individual kernels capture the prediction propensities recorded in random forests, each built from a specific pathway gene set that integrates all omics data for its member genes. AKLIMATE has comparable or improved performance relative to state-of-the-art methods on diverse phenotype learning tasks, including predicting microsatellite instability in endometrial and colorectal cancer, survival in breast cancer, and cell line response to gene knockdowns. We show how AKLIMATE is able to connect feature data across data platforms through their common pathways to identify examples of several known and novel contributors of cancer and synthetic lethality.
format Online
Article
Text
id pubmed-8081343
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8081343 2021-05-06 PLoS Comput Biol Research Article
Public Library of Science 2021-04-16 /pmc/articles/PMC8081343/ /pubmed/33861732 http://dx.doi.org/10.1371/journal.pcbi.1008878 Text en © 2021 Uzunangelov et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Accurate cancer phenotype prediction with AKLIMATE, a stacked kernel learner integrating multimodal genomic data and pathway knowledge
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8081343/
https://www.ncbi.nlm.nih.gov/pubmed/33861732
http://dx.doi.org/10.1371/journal.pcbi.1008878
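
Illustrative note on the description field: the abstract says that each kernel is derived from a random forest built on one pathway's gene set, and that the kernels are then combined in a multiple-kernel learning framework. The sketch below is not the authors' implementation. It assumes a common stand-in, the leaf co-occurrence (proximity) kernel, and it uses fixed weights where a real multiple-kernel learner would fit them. The function names, the leaf assignments, and the weights are all hypothetical.

```python
# Hypothetical sketch only: leaf co-occurrence kernels, one per pathway,
# combined with fixed non-negative weights. Not the AKLIMATE code.


def proximity_kernel(leaf_ids):
    """Build a kernel from the leaves that samples reach in a forest.

    leaf_ids[t][i] is the leaf that sample i reaches in tree t.
    K[i][j] is the fraction of trees that put samples i and j in the same leaf.
    """
    n_trees = len(leaf_ids)
    n_samples = len(leaf_ids[0])
    kernel = [[0.0] * n_samples for _ in range(n_samples)]
    for leaves in leaf_ids:
        for i in range(n_samples):
            for j in range(n_samples):
                if leaves[i] == leaves[j]:
                    kernel[i][j] += 1.0 / n_trees
    return kernel


def combine_kernels(kernels, weights):
    """Return the weighted sum of equally sized kernel matrices."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights):
        raise ValueError("kernel weights must be non-negative")
    n_samples = len(kernels[0])
    return [
        [sum(w * k[i][j] for k, w in zip(kernels, weights)) for j in range(n_samples)]
        for i in range(n_samples)
    ]


# Made-up leaf assignments for 3 samples in two 2-tree forests,
# standing in for forests trained on two different pathway gene sets.
pathway_a_leaves = [[0, 0, 1], [2, 2, 2]]
pathway_b_leaves = [[0, 1, 1], [0, 1, 1]]

kernels = [proximity_kernel(pathway_a_leaves), proximity_kernel(pathway_b_leaves)]
combined = combine_kernels(kernels, [0.5, 0.5])
```

In a real pipeline the leaf assignments would come from forests trained on each pathway's features, and the combined matrix would be passed to a kernel-based classifier or regressor.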