
Assessing the reliability of spike-in normalization for analyses of single-cell RNA sequencing data

By profiling the transcriptomes of individual cells, single-cell RNA sequencing provides unparalleled resolution to study cellular heterogeneity. However, this comes at the cost of high technical noise, including cell-specific biases in capture efficiency and library generation. One strategy for rem...


Bibliographic Details
Main Authors: Lun, Aaron T.L.; Calero-Nieto, Fernando J.; Haim-Vilmovsky, Liora; Göttgens, Berthold; Marioni, John C.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2017
Subjects: Research
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5668938/
https://www.ncbi.nlm.nih.gov/pubmed/29030468
http://dx.doi.org/10.1101/gr.222877.117
author Lun, Aaron T.L.
Calero-Nieto, Fernando J.
Haim-Vilmovsky, Liora
Göttgens, Berthold
Marioni, John C.
collection PubMed
description By profiling the transcriptomes of individual cells, single-cell RNA sequencing provides unparalleled resolution to study cellular heterogeneity. However, this comes at the cost of high technical noise, including cell-specific biases in capture efficiency and library generation. One strategy for removing these biases is to add a constant amount of spike-in RNA to each cell and to scale the observed expression values so that the coverage of spike-in transcripts is constant across cells. This approach has previously been criticized as its accuracy depends on the precise addition of spike-in RNA to each sample. Here, we perform mixture experiments using two different sets of spike-in RNA to quantify the variance in the amount of spike-in RNA added to each well in a plate-based protocol. We also obtain an upper bound on the variance due to differences in behavior between the two spike-in sets. We demonstrate that both factors are small contributors to the total technical variance and have only minor effects on downstream analyses, such as detection of highly variable genes and clustering. Our results suggest that scaling normalization using spike-in transcripts is reliable enough for routine use in single-cell RNA sequencing data analyses.
format Online
Article
Text
id pubmed-5668938
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-5668938 2017-11-13 Genome Res Research Cold Spring Harbor Laboratory Press 2017-11 /pmc/articles/PMC5668938/ /pubmed/29030468 http://dx.doi.org/10.1101/gr.222877.117 Text en © 2017 Lun et al.; Published by Cold Spring Harbor Laboratory Press. This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
title Assessing the reliability of spike-in normalization for analyses of single-cell RNA sequencing data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5668938/
https://www.ncbi.nlm.nih.gov/pubmed/29030468
http://dx.doi.org/10.1101/gr.222877.117
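
The scaling normalization summarized in the description field can be illustrated with a small sketch. This is not the authors' code: the function names (`spike_in_size_factors`, `normalize`) and the toy counts are hypothetical, and the choice to centre the size factors at a mean of 1 is an assumption made for readability. The sketch only shows the core idea stated in the abstract, namely that each cell's counts are scaled so that total spike-in coverage becomes constant across cells.

```python
from statistics import mean


def spike_in_size_factors(spike_counts):
    """Return one scaling factor per cell from its total spike-in count.

    spike_counts: list of per-cell lists of spike-in transcript counts.
    Factors are centred so their mean is 1 (an assumption of this sketch).
    """
    totals = [sum(cell) for cell in spike_counts]
    if any(t <= 0 for t in totals):
        raise ValueError("every cell needs a positive total spike-in count")
    centre = mean(totals)
    return [t / centre for t in totals]


def normalize(counts, size_factors):
    """Divide every count in a cell by that cell's size factor."""
    return [[c / sf for c in cell] for cell, sf in zip(counts, size_factors)]


# Hypothetical toy data: 3 cells, 2 spike-in transcripts, 3 endogenous genes.
# Cell 2 was captured at half the efficiency of cell 1, cell 3 at double.
spike_counts = [[50, 150], [25, 75], [100, 300]]
gene_counts = [[10, 0, 40], [5, 2, 20], [20, 4, 80]]

size_factors = spike_in_size_factors(spike_counts)
normalized_spikes = normalize(spike_counts, size_factors)
normalized_genes = normalize(gene_counts, size_factors)

# After scaling, every cell has the same total spike-in coverage.
print([round(sum(cell), 3) for cell in normalized_spikes])
```

In this toy example the first gene has counts 10, 5, and 20 before scaling and the same normalized value in all three cells afterwards, because its variation was entirely explained by the cell-specific bias captured in the spike-in totals. The approach relies on the same amount of spike-in RNA being added to each cell, which is the assumption the paper evaluates.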