Sincast: a computational framework to predict cell identities in single-cell transcriptomes using bulk atlases as references

Characterizing the molecular identity of a cell is an essential step in single-cell RNA sequencing (scRNA-seq) data analysis. Numerous tools exist for predicting cell identity using single-cell reference atlases. However, many challenges remain, including correcting for inherent batch effects betwee...

Bibliographic Details
Main Authors: Deng, Yidi; Choi, Jarny; Lê Cao, Kim-Anh
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9155616/
https://www.ncbi.nlm.nih.gov/pubmed/35362513
http://dx.doi.org/10.1093/bib/bbac088
_version_ 1784718276393172992
author Deng, Yidi
Choi, Jarny
Lê Cao, Kim-Anh
author_facet Deng, Yidi
Choi, Jarny
Lê Cao, Kim-Anh
author_sort Deng, Yidi
collection PubMed
description Characterizing the molecular identity of a cell is an essential step in single-cell RNA sequencing (scRNA-seq) data analysis. Numerous tools exist for predicting cell identity using single-cell reference atlases. However, many challenges remain, including correcting for inherent batch effects between reference and query data and insufficient phenotype data from the reference. One solution is to project single-cell data onto established bulk reference atlases to leverage their rich phenotype information. Sincast is a computational framework to query scRNA-seq data by projection onto bulk reference atlases. Prior to projection, single-cell data are transformed to be directly comparable to bulk data, either with pseudo-bulk aggregation or graph-based imputation to address sparse single-cell expression profiles. Sincast avoids batch effect correction, and cell identity is predicted along a continuum to highlight new cell states not found in the reference atlas. In several case study scenarios, we show that Sincast projects single cells into the correct biological niches in the expression space of the bulk reference atlas. We demonstrate the effectiveness of our imputation approach that was specifically developed for querying scRNA-seq data based on bulk reference atlases. We show that Sincast is an efficient and powerful tool for single-cell profiling that will facilitate downstream analysis of scRNA-seq data.
format Online
Article
Text
id pubmed-9155616
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9155616 2022-06-04 Sincast: a computational framework to predict cell identities in single-cell transcriptomes using bulk atlases as references Deng, Yidi Choi, Jarny Lê Cao, Kim-Anh Brief Bioinform Problem Solving Protocol Characterizing the molecular identity of a cell is an essential step in single-cell RNA sequencing (scRNA-seq) data analysis. Numerous tools exist for predicting cell identity using single-cell reference atlases. However, many challenges remain, including correcting for inherent batch effects between reference and query data and insufficient phenotype data from the reference. One solution is to project single-cell data onto established bulk reference atlases to leverage their rich phenotype information. Sincast is a computational framework to query scRNA-seq data by projection onto bulk reference atlases. Prior to projection, single-cell data are transformed to be directly comparable to bulk data, either with pseudo-bulk aggregation or graph-based imputation to address sparse single-cell expression profiles. Sincast avoids batch effect correction, and cell identity is predicted along a continuum to highlight new cell states not found in the reference atlas. In several case study scenarios, we show that Sincast projects single cells into the correct biological niches in the expression space of the bulk reference atlas. We demonstrate the effectiveness of our imputation approach that was specifically developed for querying scRNA-seq data based on bulk reference atlases. We show that Sincast is an efficient and powerful tool for single-cell profiling that will facilitate downstream analysis of scRNA-seq data. Oxford University Press 2022-03-31 /pmc/articles/PMC9155616/ /pubmed/35362513 http://dx.doi.org/10.1093/bib/bbac088 Text en © The Author(s) 2022. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Problem Solving Protocol
Deng, Yidi
Choi, Jarny
Lê Cao, Kim-Anh
Sincast: a computational framework to predict cell identities in single-cell transcriptomes using bulk atlases as references
title Sincast: a computational framework to predict cell identities in single-cell transcriptomes using bulk atlases as references
title_full Sincast: a computational framework to predict cell identities in single-cell transcriptomes using bulk atlases as references
title_fullStr Sincast: a computational framework to predict cell identities in single-cell transcriptomes using bulk atlases as references
title_full_unstemmed Sincast: a computational framework to predict cell identities in single-cell transcriptomes using bulk atlases as references
title_short Sincast: a computational framework to predict cell identities in single-cell transcriptomes using bulk atlases as references
title_sort sincast: a computational framework to predict cell identities in single-cell transcriptomes using bulk atlases as references
topic Problem Solving Protocol
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9155616/
https://www.ncbi.nlm.nih.gov/pubmed/35362513
http://dx.doi.org/10.1093/bib/bbac088
work_keys_str_mv AT dengyidi sincastacomputationalframeworktopredictcellidentitiesinsinglecelltranscriptomesusingbulkatlasesasreferences
AT choijarny sincastacomputationalframeworktopredictcellidentitiesinsinglecelltranscriptomesusingbulkatlasesasreferences
AT lecaokimanh sincastacomputationalframeworktopredictcellidentitiesinsinglecelltranscriptomesusingbulkatlasesasreferences