Patterns, Profiles, and Parsimony: Dissecting Transcriptional Signatures From Minimal Single-Cell RNA-Seq Output With SALSA

Bibliographic Details
Main Authors: Lozoya, Oswaldo A., McClelland, Kathryn S., Papas, Brian N., Li, Jian-Liang, Yao, Humphrey H.-C.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7586319/
https://www.ncbi.nlm.nih.gov/pubmed/33193599
http://dx.doi.org/10.3389/fgene.2020.511286
author Lozoya, Oswaldo A.
McClelland, Kathryn S.
Papas, Brian N.
Li, Jian-Liang
Yao, Humphrey H.-C.
collection PubMed
description Single-cell RNA sequencing (scRNA-seq) technologies have precipitated the development of bioinformatic tools to reconstruct cell lineage specification and differentiation processes with single-cell precision. However, current start-up costs and recommended data volumes for statistical analysis remain prohibitively expensive, preventing scRNA-seq technologies from becoming mainstream. Here, we introduce single-cell amalgamation by latent semantic analysis (SALSA), a versatile workflow that combines measurement reliability metrics with latent variable extraction to infer robust expression profiles from ultra-sparse scRNA-seq data. SALSA uses a matrix focusing approach that starts by identifying facultative genes with expression levels greater than experimental measurement precision and ends with cell clustering based on a minimal set of Profiler genes, each one a putative biomarker of cluster-specific expression profiles. To benchmark how SALSA performs in experimental settings, we used the publicly available 10X Genomics PBMC 3K dataset, a pre-curated silver standard from human frozen peripheral blood comprising 2,700 single-cell barcodes, and identified 7 major cell groups matching transcriptional profiles of peripheral blood cell types and driven agnostically by < 500 Profiler genes. Finally, we demonstrate successful implementation of SALSA in a replicative scRNA-seq scenario by using previously published DropSeq data from a multi-batch mouse retina experimental design, thereby identifying 10 transcriptionally distinct cell types from > 64,000 single cells across 7 independent biological replicates based on < 630 Profiler genes.
With these results, SALSA demonstrates that robust pattern detection from scRNA-seq expression matrices only requires a fraction of the accrued data, suggesting that single-cell sequencing technologies can become affordable and widespread if meant as hypothesis-generation tools to extract large-scale differential expression effects.
format Online
Article
Text
id pubmed-7586319
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7586319 2020-11-13 Patterns, Profiles, and Parsimony: Dissecting Transcriptional Signatures From Minimal Single-Cell RNA-Seq Output With SALSA Lozoya, Oswaldo A. McClelland, Kathryn S. Papas, Brian N. Li, Jian-Liang Yao, Humphrey H.-C. Front Genet Genetics Frontiers Media S.A. 2020-10-09 /pmc/articles/PMC7586319/ /pubmed/33193599 http://dx.doi.org/10.3389/fgene.2020.511286 Text en Copyright © 2020 Lozoya, McClelland, Papas, Li and Yao. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Patterns, Profiles, and Parsimony: Dissecting Transcriptional Signatures From Minimal Single-Cell RNA-Seq Output With SALSA
topic Genetics