
Supervised learning with word embeddings derived from PubMed captures latent knowledge about protein kinases and cancer

Inhibiting protein kinases (PKs) that cause cancers has been an important topic in cancer therapy for years. So far, almost 8% of >530 PKs have been targeted by FDA-approved medications, and around 150 protein kinase inhibitors (PKIs) have been tested in clinical trials. We present an approach ba...


Bibliographic Details
Main authors: Ravanmehr, Vida, Blau, Hannah, Cappelletti, Luca, Fontana, Tommaso, Carmody, Leigh, Coleman, Ben, George, Joshy, Reese, Justin, Joachimiak, Marcin, Bocci, Giovanni, Hansen, Peter, Bult, Carol, Rueter, Jens, Casiraghi, Elena, Valentini, Giorgio, Mungall, Christopher, Oprea, Tudor I, Robinson, Peter N
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8652379/
https://www.ncbi.nlm.nih.gov/pubmed/34888523
http://dx.doi.org/10.1093/nargab/lqab113
author Ravanmehr, Vida
Blau, Hannah
Cappelletti, Luca
Fontana, Tommaso
Carmody, Leigh
Coleman, Ben
George, Joshy
Reese, Justin
Joachimiak, Marcin
Bocci, Giovanni
Hansen, Peter
Bult, Carol
Rueter, Jens
Casiraghi, Elena
Valentini, Giorgio
Mungall, Christopher
Oprea, Tudor I
Robinson, Peter N
author_sort Ravanmehr, Vida
collection PubMed
description Inhibiting protein kinases (PKs) that cause cancers has been an important topic in cancer therapy for years. So far, almost 8% of >530 PKs have been targeted by FDA-approved medications, and around 150 protein kinase inhibitors (PKIs) have been tested in clinical trials. We present an approach based on natural language processing and machine learning to investigate the relations between PKs and cancers, predicting PKs whose inhibition would be efficacious to treat a certain cancer. Our approach represents PKs and cancers as semantically meaningful 100-dimensional vectors based on word and concept neighborhoods in PubMed abstracts. We use information about phase I-IV trials in ClinicalTrials.gov to construct a training set for random forest classification. Our results with historical data show that associations between PKs and specific cancers can be predicted years in advance with good accuracy. Our tool can be used to predict the relevance of inhibiting PKs for specific cancers and to support the design of well-focused clinical trials to discover novel PKIs for cancer therapy.
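The pipeline described in the abstract (embedding vectors for PKs and cancers, labels taken from clinical trials, evaluation on historical data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 4-dimensional toy vectors stand in for the 100-dimensional embeddings, the trial records and years are invented, the use of a difference vector as the pair feature is an assumption, and treating every pair without a trial as a negative example is a simplification. The names `EMBEDDINGS`, `Trial`, `pair_features`, and `build_dataset` are hypothetical.

```python
from dataclasses import dataclass

# Toy 4-dimensional vectors standing in for the 100-dimensional embeddings
# mentioned in the abstract. All numbers are invented for illustration.
EMBEDDINGS = {
    "EGFR": [0.9, 0.1, 0.3, 0.0],
    "BRAF": [0.2, 0.8, 0.1, 0.4],
    "lung_cancer": [0.8, 0.2, 0.2, 0.1],
    "melanoma": [0.1, 0.9, 0.0, 0.5],
}


@dataclass(frozen=True)
class Trial:
    """Hypothetical record of a trial linking a kinase inhibitor to a cancer."""
    kinase: str
    cancer: str
    start_year: int


# Invented trial records; real labels would come from ClinicalTrials.gov.
TRIALS = [
    Trial("EGFR", "lung_cancer", 2004),
    Trial("BRAF", "melanoma", 2011),
]


def pair_features(kinase, cancer):
    """Represent a kinase-cancer pair as the cancer vector minus the kinase vector.

    Assumption: the difference vector is one plausible pair representation.
    """
    k = EMBEDDINGS[kinase]
    c = EMBEDDINGS[cancer]
    return [ci - ki for ki, ci in zip(k, c)]


def build_dataset(trials, kinases, cancers, cutoff_year):
    """Label a pair 1 if a trial for it started on or before cutoff_year, else 0."""
    positives = {
        (t.kinase, t.cancer) for t in trials if t.start_year <= cutoff_year
    }
    features, labels = [], []
    for k in kinases:
        for c in cancers:
            features.append(pair_features(k, c))
            labels.append(1 if (k, c) in positives else 0)
    return features, labels


def later_positives(trials, cutoff_year):
    """Pairs whose first trial starts after the cutoff: the targets to predict."""
    return {(t.kinase, t.cancer) for t in trials if t.start_year > cutoff_year}


KINASES = ["EGFR", "BRAF"]
CANCERS = ["lung_cancer", "melanoma"]

# Train only on what was known up to 2010, then check later trials.
X_train, y_train = build_dataset(TRIALS, KINASES, CANCERS, cutoff_year=2010)
held_out = later_positives(TRIALS, cutoff_year=2010)
```

In a fuller version, `X_train` and `y_train` could be passed to a random forest classifier (for example scikit-learn's `RandomForestClassifier`), and the classifier's scores for the pairs in `held_out` would show whether trials that started after the cutoff were anticipated.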
format Online
Article
Text
id pubmed-8652379
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8652379 2021-12-08 Supervised learning with word embeddings derived from PubMed captures latent knowledge about protein kinases and cancer NAR Genom Bioinform Standard Article Oxford University Press 2021-12-08 /pmc/articles/PMC8652379/ /pubmed/34888523 http://dx.doi.org/10.1093/nargab/lqab113 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Supervised learning with word embeddings derived from PubMed captures latent knowledge about protein kinases and cancer
topic Standard Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8652379/
https://www.ncbi.nlm.nih.gov/pubmed/34888523
http://dx.doi.org/10.1093/nargab/lqab113