Biomedical generative pre-trained based transformer language model for age-related disease target discovery
Target discovery is crucial for the development of innovative therapeutics and diagnostics. However, current approaches often face limitations in efficiency, specificity, and scalability, necessitating the exploration of novel strategies for identifying and validating disease-relevant targets. Advan...
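Main Authors: | Zagirova, Diana; Pushkov, Stefan; Leung, Geoffrey Ho Duen; Liu, Bonnie Hei Man; Urban, Anatoly; Sidorenko, Denis; Kalashnikov, Aleksandr; Kozlova, Ekaterina; Naumov, Vladimir; Pun, Frank W.; Ozerov, Ivan V.; Aliper, Alex; Zhavoronkov, Alex |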
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Impact Journals 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10564439/ https://www.ncbi.nlm.nih.gov/pubmed/37742294 http://dx.doi.org/10.18632/aging.205055 |
_version_ | 1785118499259023360 |
---|---|
author | Zagirova, Diana Pushkov, Stefan Leung, Geoffrey Ho Duen Liu, Bonnie Hei Man Urban, Anatoly Sidorenko, Denis Kalashnikov, Aleksandr Kozlova, Ekaterina Naumov, Vladimir Pun, Frank W. Ozerov, Ivan V. Aliper, Alex Zhavoronkov, Alex |
author_facet | Zagirova, Diana Pushkov, Stefan Leung, Geoffrey Ho Duen Liu, Bonnie Hei Man Urban, Anatoly Sidorenko, Denis Kalashnikov, Aleksandr Kozlova, Ekaterina Naumov, Vladimir Pun, Frank W. Ozerov, Ivan V. Aliper, Alex Zhavoronkov, Alex |
author_sort | Zagirova, Diana |
collection | PubMed |
description | Target discovery is crucial for the development of innovative therapeutics and diagnostics. However, current approaches often face limitations in efficiency, specificity, and scalability, necessitating the exploration of novel strategies for identifying and validating disease-relevant targets. Advances in natural language processing have provided new avenues for predicting potential therapeutic targets for various diseases. Here, we present a novel approach for predicting therapeutic targets using a large language model (LLM). We trained a domain-specific BioGPT model on a large corpus of biomedical literature consisting of grant text and developed a pipeline for generating target prediction. Our study demonstrates that pre-training of the LLM model with task-specific texts improves its performance. Applying the developed pipeline, we retrieved prospective aging and age-related disease targets and showed that these proteins are in correspondence with the database data. Moreover, we propose CCR5 and PTH as potential novel dual-purpose anti-aging and disease targets which were not previously identified as age-related but were highly ranked in our approach. Overall, our work highlights the high potential of transformer models in novel target prediction and provides a roadmap for future integration of AI approaches for addressing the intricate challenges presented in the biomedical field. |
format | Online Article Text |
id | pubmed-10564439 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Impact Journals |
record_format | MEDLINE/PubMed |
spelling | pubmed-105644392023-10-11 Biomedical generative pre-trained based transformer language model for age-related disease target discovery Zagirova, Diana Pushkov, Stefan Leung, Geoffrey Ho Duen Liu, Bonnie Hei Man Urban, Anatoly Sidorenko, Denis Kalashnikov, Aleksandr Kozlova, Ekaterina Naumov, Vladimir Pun, Frank W. Ozerov, Ivan V. Aliper, Alex Zhavoronkov, Alex Aging (Albany NY) Research Paper Target discovery is crucial for the development of innovative therapeutics and diagnostics. However, current approaches often face limitations in efficiency, specificity, and scalability, necessitating the exploration of novel strategies for identifying and validating disease-relevant targets. Advances in natural language processing have provided new avenues for predicting potential therapeutic targets for various diseases. Here, we present a novel approach for predicting therapeutic targets using a large language model (LLM). We trained a domain-specific BioGPT model on a large corpus of biomedical literature consisting of grant text and developed a pipeline for generating target prediction. Our study demonstrates that pre-training of the LLM model with task-specific texts improves its performance. Applying the developed pipeline, we retrieved prospective aging and age-related disease targets and showed that these proteins are in correspondence with the database data. Moreover, we propose CCR5 and PTH as potential novel dual-purpose anti-aging and disease targets which were not previously identified as age-related but were highly ranked in our approach. Overall, our work highlights the high potential of transformer models in novel target prediction and provides a roadmap for future integration of AI approaches for addressing the intricate challenges presented in the biomedical field. Impact Journals 2023-09-22 /pmc/articles/PMC10564439/ /pubmed/37742294 http://dx.doi.org/10.18632/aging.205055 Text en Copyright: © 2023 Zagirova et al. 
https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/3.0/) (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Paper Zagirova, Diana Pushkov, Stefan Leung, Geoffrey Ho Duen Liu, Bonnie Hei Man Urban, Anatoly Sidorenko, Denis Kalashnikov, Aleksandr Kozlova, Ekaterina Naumov, Vladimir Pun, Frank W. Ozerov, Ivan V. Aliper, Alex Zhavoronkov, Alex Biomedical generative pre-trained based transformer language model for age-related disease target discovery |
title | Biomedical generative pre-trained based transformer language model for age-related disease target discovery |
title_full | Biomedical generative pre-trained based transformer language model for age-related disease target discovery |
title_fullStr | Biomedical generative pre-trained based transformer language model for age-related disease target discovery |
title_full_unstemmed | Biomedical generative pre-trained based transformer language model for age-related disease target discovery |
title_short | Biomedical generative pre-trained based transformer language model for age-related disease target discovery |
title_sort | biomedical generative pre-trained based transformer language model for age-related disease target discovery |
topic | Research Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10564439/ https://www.ncbi.nlm.nih.gov/pubmed/37742294 http://dx.doi.org/10.18632/aging.205055 |
work_keys_str_mv | AT zagirovadiana biomedicalgenerativepretrainedbasedtransformerlanguagemodelforagerelateddiseasetargetdiscovery AT pushkovstefan biomedicalgenerativepretrainedbasedtransformerlanguagemodelforagerelateddiseasetargetdiscovery AT leunggeoffreyhoduen biomedicalgenerativepretrainedbasedtransformerlanguagemodelforagerelateddiseasetargetdiscovery AT liubonnieheiman biomedicalgenerativepretrainedbasedtransformerlanguagemodelforagerelateddiseasetargetdiscovery AT urbananatoly biomedicalgenerativepretrainedbasedtransformerlanguagemodelforagerelateddiseasetargetdiscovery AT sidorenkodenis biomedicalgenerativepretrainedbasedtransformerlanguagemodelforagerelateddiseasetargetdiscovery AT kalashnikovaleksandr biomedicalgenerativepretrainedbasedtransformerlanguagemodelforagerelateddiseasetargetdiscovery AT kozlovaekaterina biomedicalgenerativepretrainedbasedtransformerlanguagemodelforagerelateddiseasetargetdiscovery AT naumovvladimir biomedicalgenerativepretrainedbasedtransformerlanguagemodelforagerelateddiseasetargetdiscovery AT punfrankw biomedicalgenerativepretrainedbasedtransformerlanguagemodelforagerelateddiseasetargetdiscovery AT ozerovivanv biomedicalgenerativepretrainedbasedtransformerlanguagemodelforagerelateddiseasetargetdiscovery AT aliperalex biomedicalgenerativepretrainedbasedtransformerlanguagemodelforagerelateddiseasetargetdiscovery AT zhavoronkovalex biomedicalgenerativepretrainedbasedtransformerlanguagemodelforagerelateddiseasetargetdiscovery |