Supervised learning with word embeddings derived from PubMed captures latent knowledge about protein kinases and cancer
Inhibiting protein kinases (PKs) that cause cancers has been an important topic in cancer therapy for years. So far, almost 8% of >530 PKs have been targeted by FDA-approved medications, and around 150 protein kinase inhibitors (PKIs) have been tested in clinical trials. We present an approach ba...
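Main Authors: | Ravanmehr, Vida; Blau, Hannah; Cappelletti, Luca; Fontana, Tommaso; Carmody, Leigh; Coleman, Ben; George, Joshy; Reese, Justin; Joachimiak, Marcin; Bocci, Giovanni; Hansen, Peter; Bult, Carol; Rueter, Jens; Casiraghi, Elena; Valentini, Giorgio; Mungall, Christopher; Oprea, Tudor I; Robinson, Peter N |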
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8652379/ https://www.ncbi.nlm.nih.gov/pubmed/34888523 http://dx.doi.org/10.1093/nargab/lqab113 |
_version_ | 1784611583162318848 |
---|---|
author | Ravanmehr, Vida Blau, Hannah Cappelletti, Luca Fontana, Tommaso Carmody, Leigh Coleman, Ben George, Joshy Reese, Justin Joachimiak, Marcin Bocci, Giovanni Hansen, Peter Bult, Carol Rueter, Jens Casiraghi, Elena Valentini, Giorgio Mungall, Christopher Oprea, Tudor I Robinson, Peter N |
author_sort | Ravanmehr, Vida |
collection | PubMed |
description | Inhibiting protein kinases (PKs) that cause cancers has been an important topic in cancer therapy for years. So far, almost 8% of >530 PKs have been targeted by FDA-approved medications, and around 150 protein kinase inhibitors (PKIs) have been tested in clinical trials. We present an approach based on natural language processing and machine learning to investigate the relations between PKs and cancers, predicting PKs whose inhibition would be efficacious to treat a certain cancer. Our approach represents PKs and cancers as semantically meaningful 100-dimensional vectors based on word and concept neighborhoods in PubMed abstracts. We use information about phase I-IV trials in ClinicalTrials.gov to construct a training set for random forest classification. Our results with historical data show that associations between PKs and specific cancers can be predicted years in advance with good accuracy. Our tool can be used to predict the relevance of inhibiting PKs for specific cancers and to support the design of well-focused clinical trials to discover novel PKIs for cancer therapy. |
format | Online Article Text |
id | pubmed-8652379 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8652379 2021-12-08 NAR Genom Bioinform Standard Article Oxford University Press 2021-12-08 /pmc/articles/PMC8652379/ /pubmed/34888523 http://dx.doi.org/10.1093/nargab/lqab113 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Supervised learning with word embeddings derived from PubMed captures latent knowledge about protein kinases and cancer |
topic | Standard Article |
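
The description field above outlines a pipeline in which PKs and cancers are represented as 100-dimensional vectors and clinical-trial records supply the labels for a classifier. The following is a minimal sketch of how such a labeled training set might be assembled with a historical cutoff year. It is not the authors' implementation: the placeholder names (`PK_A`, `cancer_X`), the trial years, the seeded random vectors standing in for learned embeddings, and the choice of a difference vector as the pair feature are all assumptions made for illustration.

```python
import random

DIM = 100  # the description states 100-dimensional vectors


def toy_embedding(name, dim=DIM):
    """Hypothetical stand-in for a learned embedding: a random vector
    seeded by the concept name, so the same name always maps to the
    same vector."""
    rng = random.Random(name)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def pair_features(pk_vec, cancer_vec):
    """Combine a PK vector and a cancer vector into one feature vector.
    Using the element-wise difference is an assumption; the description
    does not say how the two vectors are combined."""
    return [p - c for p, c in zip(pk_vec, cancer_vec)]


def build_training_set(pks, cancers, trials, cutoff_year):
    """Label every (PK, cancer) pair 1 if a trial linking the two exists
    on or before cutoff_year, else 0. Later trials are held out so they
    can serve as 'future' associations to predict."""
    positives = {(pk, cancer) for pk, cancer, year in trials
                 if year <= cutoff_year}
    X, y = [], []
    for pk in pks:
        for cancer in cancers:
            X.append(pair_features(toy_embedding(pk), toy_embedding(cancer)))
            y.append(1 if (pk, cancer) in positives else 0)
    return X, y


# Made-up records: (protein kinase, cancer, year of first trial).
pks = ["PK_A", "PK_B", "PK_C"]
cancers = ["cancer_X", "cancer_Y"]
trials = [
    ("PK_A", "cancer_X", 2005),
    ("PK_B", "cancer_Y", 2010),
    ("PK_C", "cancer_X", 2012),  # after the cutoff, so labeled 0 here
]

X, y = build_training_set(pks, cancers, trials, cutoff_year=2010)
print(len(X), len(X[0]), y)
```

The resulting `X` and `y` have the shape expected by a standard random forest implementation such as scikit-learn's `RandomForestClassifier`. Associations first seen after the cutoff (here the `PK_C` and `cancer_X` pair) could then be used to check whether the classifier ranks them highly.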