Genome-wide investigation of gene-cancer associations for the prediction of novel therapeutic targets in oncology
A major cause of failed drug discovery programs is suboptimal target selection, resulting in the development of drug candidates that are potent inhibitors, but ineffective at treating the disease. In the genomics era, the availability of large biomedical datasets with genome-wide readouts has the po...
Main authors: | Bazaga, Adrián; Leggate, Dan; Weisser, Hendrik |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7330039/ https://www.ncbi.nlm.nih.gov/pubmed/32612205 http://dx.doi.org/10.1038/s41598-020-67846-1 |
_version_ | 1783553025484259328 |
---|---|
author | Bazaga, Adrián; Leggate, Dan; Weisser, Hendrik |
author_sort | Bazaga, Adrián |
collection | PubMed |
description | A major cause of failed drug discovery programs is suboptimal target selection, resulting in the development of drug candidates that are potent inhibitors, but ineffective at treating the disease. In the genomics era, the availability of large biomedical datasets with genome-wide readouts has the potential to transform target selection and validation. In this study we investigate how computational intelligence methods can be applied to predict novel therapeutic targets in oncology. We compared different machine learning classifiers applied to the task of drug target classification for nine different human cancer types. For each cancer type, a set of “known” target genes was obtained and equally-sized sets of “non-targets” were sampled multiple times from the human protein-coding genes. Models were trained on mutation, gene expression (TCGA), and gene essentiality (DepMap) data. In addition, we generated a numerical embedding of the interaction network of protein-coding genes using deep network representation learning and included the results in the modeling. We assessed feature importance using a random forests classifier and performed feature selection based on measuring permutation importance against a null distribution. Our best models achieved good generalization performance based on the AUROC metric. With the best model for each cancer type, we ran predictions on more than 15,000 protein-coding genes to identify potential novel targets. Our results indicate that this approach may be useful to inform early stages of the drug discovery pipeline. |
format | Online Article Text |
id | pubmed-7330039 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-7330039 2020-07-06 Genome-wide investigation of gene-cancer associations for the prediction of novel therapeutic targets in oncology. Bazaga, Adrián; Leggate, Dan; Weisser, Hendrik. Sci Rep. Article. Nature Publishing Group UK 2020-07-01 /pmc/articles/PMC7330039/ /pubmed/32612205 http://dx.doi.org/10.1038/s41598-020-67846-1 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | Genome-wide investigation of gene-cancer associations for the prediction of novel therapeutic targets in oncology |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7330039/ https://www.ncbi.nlm.nih.gov/pubmed/32612205 http://dx.doi.org/10.1038/s41598-020-67846-1 |
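
The description above mentions three steps that a small example can make concrete: sampling equally sized sets of "non-targets" multiple times, scoring with AUROC, and comparing an observed value against a null distribution obtained by permutation. The following is a minimal sketch under stated assumptions, not the authors' pipeline: it replaces the random forest and its permutation importance with a single-feature AUROC and a label-permutation null, all function names (`auroc`, `sample_non_targets`, `auroc_with_null`) are hypothetical, and the gene names and feature values are synthetic.

```python
import random


def auroc(pos_scores, neg_scores):
    """AUROC as the chance a random positive outscores a random negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def sample_non_targets(all_genes, targets, rng):
    """Draw a non-target set with the same size as the known-target set."""
    pool = sorted(set(all_genes) - set(targets))
    return rng.sample(pool, len(targets))


def auroc_with_null(feature, targets, non_targets, n_perm, rng):
    """Return the observed AUROC of one feature and an empirical p-value.

    The null distribution is built by shuffling the pooled scores, which
    breaks any link between the feature and the target labels.
    """
    pos = [feature[g] for g in targets]
    neg = [feature[g] for g in non_targets]
    observed = auroc(pos, neg)
    pooled = pos + neg
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if auroc(pooled[:len(pos)], pooled[len(pos):]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (1 + exceed) / (1 + n_perm)


# Synthetic toy data (an assumption for illustration, not real TCGA/DepMap values):
# 200 genes, the first 20 play the role of "known" targets with a shifted feature.
rng = random.Random(0)
genes = [f"gene{i}" for i in range(200)]
targets = genes[:20]
feature = {g: rng.gauss(2.0 if g in targets else 0.0, 1.0) for g in genes}

# Repeat the balanced non-target sampling several times and score each round.
results = []
for _ in range(5):
    non_targets = sample_non_targets(genes, targets, rng)
    results.append(auroc_with_null(feature, targets, non_targets, 200, rng))

mean_auroc = sum(obs for obs, _ in results) / len(results)
print(f"mean AUROC over 5 balanced samples: {mean_auroc:.3f}")
```

Repeating the non-target draw gives a spread of AUROC values rather than a single estimate, which is one plausible reason for sampling the negative set multiple times; the record itself does not state the motivation.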