
CancerGPT: Few-shot Drug Pair Synergy Prediction using Large Pre-trained Language Models

Large pre-trained language models (LLMs) have been shown to have significant potential in few-shot learning across various fields, even with minimal training data. However, their ability to generalize to unseen tasks in more complex fields, such as biology, has yet to be fully evaluated. LLMs can of...


Bibliographic Details
Main Authors: Li, Tianhao; Shetty, Sandesh; Kamath, Advaith; Jaiswal, Ajay; Jiang, Xiaoqian; Ding, Ying; Kim, Yejin
Format: Online Article Text
Language: English
Published: Cornell University 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10153348/
https://www.ncbi.nlm.nih.gov/pubmed/37131872
_version_ 1785035912407678976
author Li, Tianhao
Shetty, Sandesh
Kamath, Advaith
Jaiswal, Ajay
Jiang, Xiaoqian
Ding, Ying
Kim, Yejin
collection PubMed
description Large pre-trained language models (LLMs) have been shown to have significant potential in few-shot learning across various fields, even with minimal training data. However, their ability to generalize to unseen tasks in more complex fields, such as biology, has yet to be fully evaluated. LLMs can offer a promising alternative approach for biological inference, particularly in cases where structured data and sample size are limited, by extracting prior knowledge from text corpora. Our proposed few-shot learning approach uses LLMs to predict the synergy of drug pairs in rare tissues that lack structured data and features. Our experiments, which involved seven rare tissues from different cancer types, demonstrated that the LLM-based prediction model achieved significant accuracy with very few or zero samples. Our proposed model, CancerGPT (with ~124M parameters), was even comparable to the larger fine-tuned GPT-3 model (with ~175B parameters). Our research is the first to tackle drug pair synergy prediction in rare tissues with limited data. We are also the first to utilize an LLM-based prediction model for biological reaction prediction tasks.
format Online
Article
Text
id pubmed-10153348
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cornell University
record_format MEDLINE/PubMed
spelling pubmed-10153348 2023-05-03 ArXiv Article Cornell University 2023-04-18 /pmc/articles/PMC10153348/ /pubmed/37131872 Text en This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title CancerGPT: Few-shot Drug Pair Synergy Prediction using Large Pre-trained Language Models
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10153348/
https://www.ncbi.nlm.nih.gov/pubmed/37131872