Patterns, Profiles, and Parsimony: Dissecting Transcriptional Signatures From Minimal Single-Cell RNA-Seq Output With SALSA
Single-cell RNA sequencing (scRNA-seq) technologies have precipitated the development of bioinformatic tools to reconstruct cell lineage specification and differentiation processes with single-cell precision. However, current start-up costs and recommended data volumes for statistical analysis remai...
Main Authors: | Lozoya, Oswaldo A.; McClelland, Kathryn S.; Papas, Brian N.; Li, Jian-Liang; Yao, Humphrey H.-C. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2020 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7586319/ https://www.ncbi.nlm.nih.gov/pubmed/33193599 http://dx.doi.org/10.3389/fgene.2020.511286 |
_version_ | 1783599971624288256 |
---|---|
author | Lozoya, Oswaldo A. McClelland, Kathryn S. Papas, Brian N. Li, Jian-Liang Yao, Humphrey H.-C. |
collection | PubMed |
description | Single-cell RNA sequencing (scRNA-seq) technologies have precipitated the development of bioinformatic tools to reconstruct cell lineage specification and differentiation processes with single-cell precision. However, current start-up costs and recommended data volumes for statistical analysis remain prohibitively expensive, preventing scRNA-seq technologies from becoming mainstream. Here, we introduce single-cell amalgamation by latent semantic analysis (SALSA), a versatile workflow that combines measurement reliability metrics with latent variable extraction to infer robust expression profiles from ultra-sparse sc-RNAseq data. SALSA uses a matrix focusing approach that starts by identifying facultative genes with expression levels greater than experimental measurement precision and ends with cell clustering based on a minimal set of Profiler genes, each one a putative biomarker of cluster-specific expression profiles. To benchmark how SALSA performs in experimental settings, we used the publicly available 10X Genomics PBMC 3K dataset, a pre-curated silver standard from human frozen peripheral blood comprising 2,700 single-cell barcodes, and identified 7 major cell groups matching transcriptional profiles of peripheral blood cell types and driven agnostically by < 500 Profiler genes. Finally, we demonstrate successful implementation of SALSA in a replicative scRNA-seq scenario by using previously published DropSeq data from a multi-batch mouse retina experimental design, thereby identifying 10 transcriptionally distinct cell types from > 64,000 single cells across 7 independent biological replicates based on < 630 Profiler genes. With these results, SALSA demonstrates that robust pattern detection from scRNA-seq expression matrices only requires a fraction of the accrued data, suggesting that single-cell sequencing technologies can become affordable and widespread if meant as hypothesis-generation tools to extract large-scale differential expression effects. |
format | Online Article Text |
id | pubmed-7586319 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7586319 2020-11-13 Patterns, Profiles, and Parsimony: Dissecting Transcriptional Signatures From Minimal Single-Cell RNA-Seq Output With SALSA Lozoya, Oswaldo A. McClelland, Kathryn S. Papas, Brian N. Li, Jian-Liang Yao, Humphrey H.-C. Front Genet Genetics Single-cell RNA sequencing (scRNA-seq) technologies have precipitated the development of bioinformatic tools to reconstruct cell lineage specification and differentiation processes with single-cell precision. However, current start-up costs and recommended data volumes for statistical analysis remain prohibitively expensive, preventing scRNA-seq technologies from becoming mainstream. Here, we introduce single-cell amalgamation by latent semantic analysis (SALSA), a versatile workflow that combines measurement reliability metrics with latent variable extraction to infer robust expression profiles from ultra-sparse sc-RNAseq data. SALSA uses a matrix focusing approach that starts by identifying facultative genes with expression levels greater than experimental measurement precision and ends with cell clustering based on a minimal set of Profiler genes, each one a putative biomarker of cluster-specific expression profiles. To benchmark how SALSA performs in experimental settings, we used the publicly available 10X Genomics PBMC 3K dataset, a pre-curated silver standard from human frozen peripheral blood comprising 2,700 single-cell barcodes, and identified 7 major cell groups matching transcriptional profiles of peripheral blood cell types and driven agnostically by < 500 Profiler genes. Finally, we demonstrate successful implementation of SALSA in a replicative scRNA-seq scenario by using previously published DropSeq data from a multi-batch mouse retina experimental design, thereby identifying 10 transcriptionally distinct cell types from > 64,000 single cells across 7 independent biological replicates based on < 630 Profiler genes. With these results, SALSA demonstrates that robust pattern detection from scRNA-seq expression matrices only requires a fraction of the accrued data, suggesting that single-cell sequencing technologies can become affordable and widespread if meant as hypothesis-generation tools to extract large-scale differential expression effects. Frontiers Media S.A. 2020-10-09 /pmc/articles/PMC7586319/ /pubmed/33193599 http://dx.doi.org/10.3389/fgene.2020.511286 Text en Copyright © 2020 Lozoya, McClelland, Papas, Li and Yao. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Patterns, Profiles, and Parsimony: Dissecting Transcriptional Signatures From Minimal Single-Cell RNA-Seq Output With SALSA |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7586319/ https://www.ncbi.nlm.nih.gov/pubmed/33193599 http://dx.doi.org/10.3389/fgene.2020.511286 |
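
The description in this record characterizes SALSA as combining latent variable extraction (latent semantic analysis) with cell clustering on a sparse expression matrix. The sketch below is only a generic illustration of that idea and is not the authors' implementation: the TF-IDF weighting, the SVD-based embedding, the small k-means routine, the function names `lsa_embed` and `kmeans`, and the toy count matrix are all assumptions introduced here. SALSA's gene-filtering step based on measurement precision and its selection of Profiler genes are not reproduced.

```python
import numpy as np


def lsa_embed(counts, n_components=2):
    """Embed cells with a generic latent semantic analysis.

    counts: genes x cells matrix of non-negative counts (hypothetical input).
    Returns a cells x n_components array of latent coordinates.
    """
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[1]
    # Term frequency: scale each cell (column) by its total count.
    totals = counts.sum(axis=0, keepdims=True)
    tf = counts / np.maximum(totals, 1.0)
    # Inverse document frequency: down-weight genes detected in many cells.
    detected = (counts > 0).sum(axis=1, keepdims=True)
    idf = np.log((1.0 + n_cells) / (1.0 + detected))
    weighted = tf * idf
    # SVD of the weighted matrix; rows of vt index the cells.
    _, s, vt = np.linalg.svd(weighted, full_matrices=False)
    return vt[:n_components].T * s[:n_components]


def kmeans(points, k, n_iter=20):
    """Minimal k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


# Toy data (invented for illustration): genes 0-2 are expressed only in
# cells 0-2, and genes 3-5 only in cells 3-5.
toy_counts = np.array([
    [5, 4, 6, 0, 0, 0],
    [3, 5, 4, 0, 0, 0],
    [4, 6, 5, 0, 0, 0],
    [0, 0, 0, 3, 7, 5],
    [0, 0, 0, 6, 4, 5],
    [0, 0, 0, 5, 5, 4],
])

embedding = lsa_embed(toy_counts, n_components=2)
labels = kmeans(embedding, k=2)
print(labels)
```

On this toy matrix the two blocks of cells fall into separate clusters because each block dominates a different latent component. Real scRNA-seq matrices are far sparser and noisier, so the number of components and clusters would need to be chosen from the data.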