An integrated single-cell transcriptomic dataset for non-small cell lung cancer
As single-cell RNA sequencing (scRNA-seq) has emerged as a great tool for studying cellular heterogeneity within the past decade, the number of available scRNA-seq datasets also rapidly increased. However, reuse of such data is often problematic due to a small cohort size, limited cell types, and in...
Main Authors: | Prazanowska, Karolina Hanna; Lim, Su Bin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2023 |
Subjects: | Analysis |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10042991/ https://www.ncbi.nlm.nih.gov/pubmed/36973297 http://dx.doi.org/10.1038/s41597-023-02074-6 |
_version_ | 1784913053330964480 |
---|---|
author | Prazanowska, Karolina Hanna; Lim, Su Bin |
author_facet | Prazanowska, Karolina Hanna; Lim, Su Bin |
author_sort | Prazanowska, Karolina Hanna |
collection | PubMed |
description | As single-cell RNA sequencing (scRNA-seq) has emerged as a great tool for studying cellular heterogeneity within the past decade, the number of available scRNA-seq datasets also rapidly increased. However, reuse of such data is often problematic due to a small cohort size, limited cell types, and insufficient information on cell type classification. Here, we present a large integrated scRNA-seq dataset containing 224,611 cells from human primary non-small cell lung cancer (NSCLC) tumors. Using publicly available resources, we pre-processed and integrated seven independent scRNA-seq datasets using an anchor-based approach, with five datasets utilized as reference and the remaining two, as validation. We created two levels of annotation based on cell type-specific markers conserved across the datasets. To demonstrate usability of the integrated dataset, we created annotation predictions for the two validation datasets using our integrated reference. Additionally, we conducted a trajectory analysis on subsets of T cells and lung cancer cells. This integrated data may serve as a resource for studying NSCLC transcriptome at the single cell level. |
format | Online Article Text |
id | pubmed-10042991 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10042991 2023-03-29 An integrated single-cell transcriptomic dataset for non-small cell lung cancer Prazanowska, Karolina Hanna Lim, Su Bin Sci Data Analysis As single-cell RNA sequencing (scRNA-seq) has emerged as a great tool for studying cellular heterogeneity within the past decade, the number of available scRNA-seq datasets also rapidly increased. However, reuse of such data is often problematic due to a small cohort size, limited cell types, and insufficient information on cell type classification. Here, we present a large integrated scRNA-seq dataset containing 224,611 cells from human primary non-small cell lung cancer (NSCLC) tumors. Using publicly available resources, we pre-processed and integrated seven independent scRNA-seq datasets using an anchor-based approach, with five datasets utilized as reference and the remaining two, as validation. We created two levels of annotation based on cell type-specific markers conserved across the datasets. To demonstrate usability of the integrated dataset, we created annotation predictions for the two validation datasets using our integrated reference. Additionally, we conducted a trajectory analysis on subsets of T cells and lung cancer cells. This integrated data may serve as a resource for studying NSCLC transcriptome at the single cell level. Nature Publishing Group UK 2023-03-27 /pmc/articles/PMC10042991/ /pubmed/36973297 http://dx.doi.org/10.1038/s41597-023-02074-6 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Analysis Prazanowska, Karolina Hanna Lim, Su Bin An integrated single-cell transcriptomic dataset for non-small cell lung cancer |
title | An integrated single-cell transcriptomic dataset for non-small cell lung cancer |
title_full | An integrated single-cell transcriptomic dataset for non-small cell lung cancer |
title_fullStr | An integrated single-cell transcriptomic dataset for non-small cell lung cancer |
title_full_unstemmed | An integrated single-cell transcriptomic dataset for non-small cell lung cancer |
title_short | An integrated single-cell transcriptomic dataset for non-small cell lung cancer |
title_sort | integrated single-cell transcriptomic dataset for non-small cell lung cancer |
topic | Analysis |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10042991/ https://www.ncbi.nlm.nih.gov/pubmed/36973297 http://dx.doi.org/10.1038/s41597-023-02074-6 |
work_keys_str_mv | AT prazanowskakarolinahanna anintegratedsinglecelltranscriptomicdatasetfornonsmallcelllungcancer AT limsubin anintegratedsinglecelltranscriptomicdatasetfornonsmallcelllungcancer AT prazanowskakarolinahanna integratedsinglecelltranscriptomicdatasetfornonsmallcelllungcancer AT limsubin integratedsinglecelltranscriptomicdatasetfornonsmallcelllungcancer |