OncoRTT: Predicting novel oncology-related therapeutic targets using BERT embeddings and omics features
Late-stage drug development failures are usually a consequence of ineffective targets. Thus, proper target identification is needed, which may be possible using computational approaches. The reason being, effective targets have disease-relevant biological functions, and omics data unveil the protein...
Main authors: | Thafar, Maha A.; Albaradei, Somayah; Uludag, Mahmut; Alshahrani, Mona; Gojobori, Takashi; Essack, Magbubah; Gao, Xin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10117673/ https://www.ncbi.nlm.nih.gov/pubmed/37091791 http://dx.doi.org/10.3389/fgene.2023.1139626 |
_version_ | 1785028642495004672 |
---|---|
author | Thafar, Maha A. Albaradei, Somayah Uludag, Mahmut Alshahrani, Mona Gojobori, Takashi Essack, Magbubah Gao, Xin |
author_facet | Thafar, Maha A. Albaradei, Somayah Uludag, Mahmut Alshahrani, Mona Gojobori, Takashi Essack, Magbubah Gao, Xin |
author_sort | Thafar, Maha A. |
collection | PubMed |
description | Late-stage drug development failures are usually a consequence of ineffective targets. Thus, proper target identification is needed, which may be possible using computational approaches. The reason being, effective targets have disease-relevant biological functions, and omics data unveil the proteins involved in these functions. Also, properties that favor the existence of binding between drug and target are deducible from the protein’s amino acid sequence. In this work, we developed OncoRTT, a deep learning (DL)-based method for predicting novel therapeutic targets. OncoRTT is designed to reduce suboptimal target selection by identifying novel targets based on features of known effective targets using DL approaches. First, we created the “OncologyTT” datasets, which include genes/proteins associated with ten prevalent cancer types. Then, we generated three sets of features for all genes: omics features, the proteins’ amino-acid sequence BERT embeddings, and the integrated features to train and test the DL classifiers separately. The models achieved high prediction performances in terms of area under the curve (AUC), i.e., AUC greater than 0.88 for all cancer types, with a maximum of 0.95 for leukemia. Also, OncoRTT outperformed the state-of-the-art method using their data in five out of seven cancer types commonly assessed by both methods. Furthermore, OncoRTT predicts novel therapeutic targets using new test data related to the seven cancer types. We further corroborated these results with other validation evidence using the Open Targets Platform and a case study focused on the top-10 predicted therapeutic targets for lung cancer. |
format | Online Article Text |
id | pubmed-10117673 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-101176732023-04-21 OncoRTT: Predicting novel oncology-related therapeutic targets using BERT embeddings and omics features Thafar, Maha A. Albaradei, Somayah Uludag, Mahmut Alshahrani, Mona Gojobori, Takashi Essack, Magbubah Gao, Xin Front Genet Genetics Late-stage drug development failures are usually a consequence of ineffective targets. Thus, proper target identification is needed, which may be possible using computational approaches. The reason being, effective targets have disease-relevant biological functions, and omics data unveil the proteins involved in these functions. Also, properties that favor the existence of binding between drug and target are deducible from the protein’s amino acid sequence. In this work, we developed OncoRTT, a deep learning (DL)-based method for predicting novel therapeutic targets. OncoRTT is designed to reduce suboptimal target selection by identifying novel targets based on features of known effective targets using DL approaches. First, we created the “OncologyTT” datasets, which include genes/proteins associated with ten prevalent cancer types. Then, we generated three sets of features for all genes: omics features, the proteins’ amino-acid sequence BERT embeddings, and the integrated features to train and test the DL classifiers separately. The models achieved high prediction performances in terms of area under the curve (AUC), i.e., AUC greater than 0.88 for all cancer types, with a maximum of 0.95 for leukemia. Also, OncoRTT outperformed the state-of-the-art method using their data in five out of seven cancer types commonly assessed by both methods. Furthermore, OncoRTT predicts novel therapeutic targets using new test data related to the seven cancer types. We further corroborated these results with other validation evidence using the Open Targets Platform and a case study focused on the top-10 predicted therapeutic targets for lung cancer. Frontiers Media S.A. 
2023-04-06 /pmc/articles/PMC10117673/ /pubmed/37091791 http://dx.doi.org/10.3389/fgene.2023.1139626 Text en Copyright © 2023 Thafar, Albaradei, Uludag, Alshahrani, Gojobori, Essack and Gao. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Thafar, Maha A. Albaradei, Somayah Uludag, Mahmut Alshahrani, Mona Gojobori, Takashi Essack, Magbubah Gao, Xin OncoRTT: Predicting novel oncology-related therapeutic targets using BERT embeddings and omics features |
title | OncoRTT: Predicting novel oncology-related therapeutic targets using BERT embeddings and omics features |
title_sort | oncortt: predicting novel oncology-related therapeutic targets using bert embeddings and omics features |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10117673/ https://www.ncbi.nlm.nih.gov/pubmed/37091791 http://dx.doi.org/10.3389/fgene.2023.1139626 |