KCML: a machine‐learning framework for inference of multi‐scale gene functions from genetic perturbation screens

Characterising context‐dependent gene functions is crucial for understanding the genetic bases of health and disease. To date, inference of gene functions from large‐scale genetic perturbation screens is based on ad hoc analysis pipelines involving unsupervised clustering and functional enrichment....

Bibliographic Details
Main Authors: Sailem, Heba Z; Rittscher, Jens; Pelkmans, Lucas
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7059140/
https://www.ncbi.nlm.nih.gov/pubmed/32141232
http://dx.doi.org/10.15252/msb.20199083
author Sailem, Heba Z
Rittscher, Jens
Pelkmans, Lucas
collection PubMed
description Characterising context‐dependent gene functions is crucial for understanding the genetic bases of health and disease. To date, inference of gene functions from large‐scale genetic perturbation screens is based on ad hoc analysis pipelines involving unsupervised clustering and functional enrichment. We present Knowledge‐ and Context‐driven Machine Learning (KCML), a framework that systematically predicts multiple context‐specific functions for a given gene based on the similarity of its perturbation phenotype to those with known function. As a proof of concept, we test KCML on three datasets describing phenotypes at the molecular, cellular and population levels and show that it outperforms traditional analysis pipelines. In particular, KCML identified an abnormal multicellular organisation phenotype associated with the depletion of olfactory receptors, and TGFβ and WNT signalling genes in colorectal cancer cells. We validate these predictions in colorectal cancer patients and show that olfactory receptors expression is predictive of worse patient outcomes. These results highlight KCML as a systematic framework for discovering novel scale‐crossing and context‐dependent gene functions. KCML is highly generalisable and applicable to various large‐scale genetic perturbation screens.
format Online
Article
Text
id pubmed-7059140
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-7059140 2020-03-11 KCML: a machine‐learning framework for inference of multi‐scale gene functions from genetic perturbation screens Sailem, Heba Z Rittscher, Jens Pelkmans, Lucas Mol Syst Biol Methods John Wiley and Sons Inc. 2020-03-06 /pmc/articles/PMC7059140/ /pubmed/32141232 http://dx.doi.org/10.15252/msb.20199083 Text en © 2020 The Authors. Published under the terms of the CC BY 4.0 license. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title KCML: a machine‐learning framework for inference of multi‐scale gene functions from genetic perturbation screens
topic Methods
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7059140/
https://www.ncbi.nlm.nih.gov/pubmed/32141232
http://dx.doi.org/10.15252/msb.20199083