Generative pretraining from large-scale transcriptomes for single-cell deciphering

Bibliographic Details
Main Authors: Shen, Hongru, Liu, Jilei, Hu, Jiani, Shen, Xilin, Zhang, Chao, Wu, Dan, Feng, Mengyao, Yang, Meng, Li, Yang, Yang, Yichen, Wang, Wei, Zhang, Qiang, Yang, Jilong, Chen, Kexin, Li, Xiangchun
Format: Online Article Text
Language: English
Published: Elsevier 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10176267/
https://www.ncbi.nlm.nih.gov/pubmed/37187700
http://dx.doi.org/10.1016/j.isci.2023.106536
author Shen, Hongru
Liu, Jilei
Hu, Jiani
Shen, Xilin
Zhang, Chao
Wu, Dan
Feng, Mengyao
Yang, Meng
Li, Yang
Yang, Yichen
Wang, Wei
Zhang, Qiang
Yang, Jilong
Chen, Kexin
Li, Xiangchun
author_sort Shen, Hongru
collection PubMed
description Exponential accumulation of single-cell transcriptomes poses a great challenge for efficient assimilation. Here, we present an approach entitled generative pretraining from transcriptomes (tGPT) for learning feature representations of transcriptomes. tGPT is conceptually simple in that it autoregressively models the ranking of a gene in the context of its preceding neighbors. We developed tGPT with 22.3 million single-cell transcriptomes and used four single-cell datasets to evaluate its performance on single-cell analysis tasks. In addition, we examined its applications on bulk tissues. The single-cell clusters and cell lineage trajectories derived from tGPT are highly aligned with known cell labels and states. The feature patterns of tumor bulk tissues learned by tGPT are associated with a wide range of genomic alteration events, prognosis, and treatment outcome of immunotherapy. tGPT represents a new analytical paradigm for integrating and deciphering massive amounts of transcriptome data, and it will facilitate the interpretation and clinical translation of single-cell transcriptomes.
format Online
Article
Text
id pubmed-10176267
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10176267 2023-05-13 iScience Article Elsevier 2023-04-20 /pmc/articles/PMC10176267/ /pubmed/37187700 http://dx.doi.org/10.1016/j.isci.2023.106536 Text en © 2023 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Generative pretraining from large-scale transcriptomes for single-cell deciphering
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10176267/
https://www.ncbi.nlm.nih.gov/pubmed/37187700
http://dx.doi.org/10.1016/j.isci.2023.106536