Biomedical generative pre-trained based transformer language model for age-related disease target discovery

Target discovery is crucial for the development of innovative therapeutics and diagnostics. However, current approaches often face limitations in efficiency, specificity, and scalability, necessitating the exploration of novel strategies for identifying and validating disease-relevant targets. Advan...

Bibliographic Details
Main Authors: Zagirova, Diana; Pushkov, Stefan; Leung, Geoffrey Ho Duen; Liu, Bonnie Hei Man; Urban, Anatoly; Sidorenko, Denis; Kalashnikov, Aleksandr; Kozlova, Ekaterina; Naumov, Vladimir; Pun, Frank W.; Ozerov, Ivan V.; Aliper, Alex; Zhavoronkov, Alex
Format: Online Article Text
Language: English
Published: Impact Journals 2023
Subjects: Research Paper
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10564439/
https://www.ncbi.nlm.nih.gov/pubmed/37742294
http://dx.doi.org/10.18632/aging.205055
_version_ 1785118499259023360
author Zagirova, Diana
Pushkov, Stefan
Leung, Geoffrey Ho Duen
Liu, Bonnie Hei Man
Urban, Anatoly
Sidorenko, Denis
Kalashnikov, Aleksandr
Kozlova, Ekaterina
Naumov, Vladimir
Pun, Frank W.
Ozerov, Ivan V.
Aliper, Alex
Zhavoronkov, Alex
author_sort Zagirova, Diana
collection PubMed
description Target discovery is crucial for the development of innovative therapeutics and diagnostics. However, current approaches often face limitations in efficiency, specificity, and scalability, necessitating the exploration of novel strategies for identifying and validating disease-relevant targets. Advances in natural language processing have provided new avenues for predicting potential therapeutic targets for various diseases. Here, we present a novel approach for predicting therapeutic targets using a large language model (LLM). We trained a domain-specific BioGPT model on a large corpus of biomedical literature consisting of grant text and developed a pipeline for generating target predictions. Our study demonstrates that pre-training the LLM on task-specific texts improves its performance. Applying the developed pipeline, we retrieved prospective aging and age-related disease targets and showed that these proteins are consistent with the database data. Moreover, we propose CCR5 and PTH as potential novel dual-purpose anti-aging and disease targets that were not previously identified as age-related but ranked highly in our approach. Overall, our work highlights the high potential of transformer models in novel target prediction and provides a roadmap for future integration of AI approaches to address the intricate challenges of the biomedical field.
format Online
Article
Text
id pubmed-10564439
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Impact Journals
record_format MEDLINE/PubMed
spelling pubmed-10564439 2023-10-11 Aging (Albany NY) Research Paper Impact Journals 2023-09-22 /pmc/articles/PMC10564439/ /pubmed/37742294 http://dx.doi.org/10.18632/aging.205055 Text en Copyright: © 2023 Zagirova et al.
https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/3.0/) (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Biomedical generative pre-trained based transformer language model for age-related disease target discovery
topic Research Paper