Genome-wide investigation of gene-cancer associations for the prediction of novel therapeutic targets in oncology

Bibliographic Details
Main Authors: Bazaga, Adrián; Leggate, Dan; Weisser, Hendrik
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7330039/
https://www.ncbi.nlm.nih.gov/pubmed/32612205
http://dx.doi.org/10.1038/s41598-020-67846-1
author Bazaga, Adrián
Leggate, Dan
Weisser, Hendrik
collection PubMed
description A major cause of failed drug discovery programs is suboptimal target selection, resulting in the development of drug candidates that are potent inhibitors, but ineffective at treating the disease. In the genomics era, the availability of large biomedical datasets with genome-wide readouts has the potential to transform target selection and validation. In this study we investigate how computational intelligence methods can be applied to predict novel therapeutic targets in oncology. We compared different machine learning classifiers applied to the task of drug target classification for nine different human cancer types. For each cancer type, a set of “known” target genes was obtained and equally-sized sets of “non-targets” were sampled multiple times from the human protein-coding genes. Models were trained on mutation, gene expression (TCGA), and gene essentiality (DepMap) data. In addition, we generated a numerical embedding of the interaction network of protein-coding genes using deep network representation learning and included the results in the modeling. We assessed feature importance using a random forests classifier and performed feature selection based on measuring permutation importance against a null distribution. Our best models achieved good generalization performance based on the AUROC metric. With the best model for each cancer type, we ran predictions on more than 15,000 protein-coding genes to identify potential novel targets. Our results indicate that this approach may be useful to inform early stages of the drug discovery pipeline.
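The evaluation setup summarized in the description (equally-sized non-target sets sampled repeatedly, AUROC as the performance metric, permutation importance compared against a null distribution) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the gene identifiers, the seed handling, and the add-one correction in the empirical p-value are all assumptions made for illustration, and the record does not say how the original study implemented these steps.

```python
import random
from typing import List, Sequence


def sample_non_targets(all_genes: Sequence[str], targets: Sequence[str],
                       n_rounds: int, seed: int = 0) -> List[List[str]]:
    """Draw n_rounds negative sets, each the same size as the target set.

    Hypothetical helper: negatives are sampled without replacement from
    all genes that are not known targets.
    """
    rng = random.Random(seed)
    target_set = set(targets)
    candidates = sorted(g for g in set(all_genes) if g not in target_set)
    return [rng.sample(candidates, len(target_set)) for _ in range(n_rounds)]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def empirical_p_value(observed: float, null_values: Sequence[float]) -> float:
    """Share of null importances at least as large as the observed one.

    The add-one correction is an assumed convention, not taken from the paper.
    """
    exceed = sum(v >= observed for v in null_values)
    return (exceed + 1) / (len(null_values) + 1)


if __name__ == "__main__":
    genes = [f"GENE{i}" for i in range(20)]   # made-up identifiers
    known_targets = genes[:5]
    for negatives in sample_non_targets(genes, known_targets, n_rounds=3):
        print(sorted(negatives))
    print(auroc([1, 1, 0, 0], [0.9, 0.2, 0.8, 0.1]))
    print(empirical_p_value(0.5, [0.1, 0.2, 0.6]))
```

In such a setup, one model would be trained per sampled negative set, and a feature would be kept only if its observed importance is unlikely under the null importances. The choice of threshold is left open here.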
format Online
Article
Text
id pubmed-7330039
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7330039 2020-07-06 Genome-wide investigation of gene-cancer associations for the prediction of novel therapeutic targets in oncology Bazaga, Adrián Leggate, Dan Weisser, Hendrik Sci Rep Article
Nature Publishing Group UK 2020-07-01 /pmc/articles/PMC7330039/ /pubmed/32612205 http://dx.doi.org/10.1038/s41598-020-67846-1 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Genome-wide investigation of gene-cancer associations for the prediction of novel therapeutic targets in oncology
topic Article