
Model-based analysis of sample index hopping reveals its widespread artifacts in multiplexed single-cell RNA-sequencing

Index hopping is the main cause of incorrect sample assignment of sequencing reads in multiplexed pooled libraries. We introduce a statistical model for estimating the sample index-hopping rate in multiplexed droplet-based single-cell RNA-seq data and for probabilistic inference of the true sample o...

Bibliographic Details
Main authors: Farouni, Rick; Djambazian, Haig; Ferri, Lorenzo E.; Ragoussis, Jiannis; Najafabadi, Hamed S.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7264361/
https://www.ncbi.nlm.nih.gov/pubmed/32483174
http://dx.doi.org/10.1038/s41467-020-16522-z
author Farouni, Rick
Djambazian, Haig
Ferri, Lorenzo E.
Ragoussis, Jiannis
Najafabadi, Hamed S.
collection PubMed
description Index hopping is the main cause of incorrect sample assignment of sequencing reads in multiplexed pooled libraries. We introduce a statistical model for estimating the sample index-hopping rate in multiplexed droplet-based single-cell RNA-seq data and for probabilistic inference of the true sample of origin of hopped reads. We analyze several datasets and estimate the sample index hopping probability to range between 0.003–0.009, a small number that counter-intuitively gives rise to a large fraction of phantom molecules — the fraction of phantom molecules exceeds 8% in more than 25% of samples and reaches as high as 85% in low-complexity samples. Phantom molecules lead to widespread complications in downstream analyses, including transcriptome mixing across cells, emergence of phantom copies of cells from other samples, and misclassification of empty droplets as cells. We demonstrate that our approach can correct for these artifacts by accurately purging the majority of phantom molecules from the data.
format Online
Article
Text
id pubmed-7264361
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7264361 2020-06-12 Nat Commun Article Nature Publishing Group UK 2020-06-01 /pmc/articles/PMC7264361/ /pubmed/32483174 http://dx.doi.org/10.1038/s41467-020-16522-z Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Model-based analysis of sample index hopping reveals its widespread artifacts in multiplexed single-cell RNA-sequencing
topic Article
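
Illustrative note on the description field: the abstract reports that a hopping probability of only 0.003 to 0.009 can still produce a large phantom fraction in low-complexity samples. The sketch below is a simplified read-level calculation meant only to show why that can happen. It is not the statistical model from the article. It assumes, hypothetically, that every read hops independently with the same probability and that a hopped read lands on any of the other samples' indices with equal chance. The function name `phantom_read_fraction` and the read counts are invented for illustration.

```python
def phantom_read_fraction(read_counts, hop_prob):
    """Expected fraction of reads assigned to each sample that came from another sample.

    Simplifying assumptions (not the article's model): each read hops
    independently with probability hop_prob, and a hopped read is reassigned
    uniformly at random to one of the other samples in the pool.
    """
    n_samples = len(read_counts)
    if n_samples < 2:
        raise ValueError("need at least two multiplexed samples")
    total = sum(read_counts)
    fractions = []
    for own in read_counts:
        # Reads that keep their correct sample index.
        stayed = (1.0 - hop_prob) * own
        # Reads from all other samples that hop onto this sample's index.
        incoming = hop_prob * (total - own) / (n_samples - 1)
        denom = stayed + incoming
        fractions.append(incoming / denom if denom > 0 else 0.0)
    return fractions


if __name__ == "__main__":
    # Hypothetical pool: three deeply sequenced samples and one low-complexity sample.
    counts = [100_000_000, 100_000_000, 100_000_000, 1_000_000]
    for n, frac in zip(counts, phantom_read_fraction(counts, hop_prob=0.005)):
        print(f"{n:>12,d} reads: expected phantom read fraction {frac:.4f}")
```

Under these assumptions the small sample receives hopped reads from three much larger neighbours, so its phantom fraction is far larger than the hopping probability itself, while the large samples stay close to it. The article works at the level of molecules rather than reads, so the numbers here are not comparable to its reported estimates.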