Accurate cancer phenotype prediction with AKLIMATE, a stacked kernel learner integrating multimodal genomic data and pathway knowledge
Advancements in sequencing have led to the proliferation of multi-omic profiles of human cells under different conditions and perturbations. In addition, many databases have amassed information about pathways and gene “signatures”—patterns of gene expression associated with specific cellular and phe...
Main Authors: | Uzunangelov, Vladislav; Wong, Christopher K.; Stuart, Joshua M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8081343/ https://www.ncbi.nlm.nih.gov/pubmed/33861732 http://dx.doi.org/10.1371/journal.pcbi.1008878 |
_version_ | 1783685619706232832 |
---|---|
author | Uzunangelov, Vladislav Wong, Christopher K. Stuart, Joshua M. |
author_facet | Uzunangelov, Vladislav Wong, Christopher K. Stuart, Joshua M. |
author_sort | Uzunangelov, Vladislav |
collection | PubMed |
description | Advancements in sequencing have led to the proliferation of multi-omic profiles of human cells under different conditions and perturbations. In addition, many databases have amassed information about pathways and gene “signatures”—patterns of gene expression associated with specific cellular and phenotypic contexts. An important current challenge in systems biology is to leverage such knowledge about gene coordination to maximize the predictive power and generalization of models applied to high-throughput datasets. However, few such integrative approaches exist that also provide interpretable results quantifying the importance of individual genes and pathways to model accuracy. We introduce AKLIMATE, a first kernel-based stacked learner that seamlessly incorporates multi-omics feature data with prior information in the form of pathways for either regression or classification tasks. AKLIMATE uses a novel multiple-kernel learning framework where individual kernels capture the prediction propensities recorded in random forests, each built from a specific pathway gene set that integrates all omics data for its member genes. AKLIMATE has comparable or improved performance relative to state-of-the-art methods on diverse phenotype learning tasks, including predicting microsatellite instability in endometrial and colorectal cancer, survival in breast cancer, and cell line response to gene knockdowns. We show how AKLIMATE is able to connect feature data across data platforms through their common pathways to identify examples of several known and novel contributors of cancer and synthetic lethality. |
format | Online Article Text |
id | pubmed-8081343 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-80813432021-05-06 Accurate cancer phenotype prediction with AKLIMATE, a stacked kernel learner integrating multimodal genomic data and pathway knowledge Uzunangelov, Vladislav Wong, Christopher K. Stuart, Joshua M. PLoS Comput Biol Research Article Advancements in sequencing have led to the proliferation of multi-omic profiles of human cells under different conditions and perturbations. In addition, many databases have amassed information about pathways and gene “signatures”—patterns of gene expression associated with specific cellular and phenotypic contexts. An important current challenge in systems biology is to leverage such knowledge about gene coordination to maximize the predictive power and generalization of models applied to high-throughput datasets. However, few such integrative approaches exist that also provide interpretable results quantifying the importance of individual genes and pathways to model accuracy. We introduce AKLIMATE, a first kernel-based stacked learner that seamlessly incorporates multi-omics feature data with prior information in the form of pathways for either regression or classification tasks. AKLIMATE uses a novel multiple-kernel learning framework where individual kernels capture the prediction propensities recorded in random forests, each built from a specific pathway gene set that integrates all omics data for its member genes. AKLIMATE has comparable or improved performance relative to state-of-the-art methods on diverse phenotype learning tasks, including predicting microsatellite instability in endometrial and colorectal cancer, survival in breast cancer, and cell line response to gene knockdowns. We show how AKLIMATE is able to connect feature data across data platforms through their common pathways to identify examples of several known and novel contributors of cancer and synthetic lethality. 
Public Library of Science 2021-04-16 /pmc/articles/PMC8081343/ /pubmed/33861732 http://dx.doi.org/10.1371/journal.pcbi.1008878 Text en © 2021 Uzunangelov et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Uzunangelov, Vladislav Wong, Christopher K. Stuart, Joshua M. Accurate cancer phenotype prediction with AKLIMATE, a stacked kernel learner integrating multimodal genomic data and pathway knowledge |
title | Accurate cancer phenotype prediction with AKLIMATE, a stacked kernel learner integrating multimodal genomic data and pathway knowledge |
title_full | Accurate cancer phenotype prediction with AKLIMATE, a stacked kernel learner integrating multimodal genomic data and pathway knowledge |
title_fullStr | Accurate cancer phenotype prediction with AKLIMATE, a stacked kernel learner integrating multimodal genomic data and pathway knowledge |
title_full_unstemmed | Accurate cancer phenotype prediction with AKLIMATE, a stacked kernel learner integrating multimodal genomic data and pathway knowledge |
title_short | Accurate cancer phenotype prediction with AKLIMATE, a stacked kernel learner integrating multimodal genomic data and pathway knowledge |
title_sort | accurate cancer phenotype prediction with aklimate, a stacked kernel learner integrating multimodal genomic data and pathway knowledge |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8081343/ https://www.ncbi.nlm.nih.gov/pubmed/33861732 http://dx.doi.org/10.1371/journal.pcbi.1008878 |
work_keys_str_mv | AT uzunangelovvladislav accuratecancerphenotypepredictionwithaklimateastackedkernellearnerintegratingmultimodalgenomicdataandpathwayknowledge AT wongchristopherk accuratecancerphenotypepredictionwithaklimateastackedkernellearnerintegratingmultimodalgenomicdataandpathwayknowledge AT stuartjoshuam accuratecancerphenotypepredictionwithaklimateastackedkernellearnerintegratingmultimodalgenomicdataandpathwayknowledge |