Model-based analysis of sample index hopping reveals its widespread artifacts in multiplexed single-cell RNA-sequencing
Index hopping is the main cause of incorrect sample assignment of sequencing reads in multiplexed pooled libraries. We introduce a statistical model for estimating the sample index-hopping rate in multiplexed droplet-based single-cell RNA-seq data and for probabilistic inference of the true sample o...
Main authors: | Farouni, Rick, Djambazian, Haig, Ferri, Lorenzo E., Ragoussis, Jiannis, Najafabadi, Hamed S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7264361/ https://www.ncbi.nlm.nih.gov/pubmed/32483174 http://dx.doi.org/10.1038/s41467-020-16522-z |
_version_ | 1783540957956800512 |
---|---|
author | Farouni, Rick Djambazian, Haig Ferri, Lorenzo E. Ragoussis, Jiannis Najafabadi, Hamed S. |
author_facet | Farouni, Rick Djambazian, Haig Ferri, Lorenzo E. Ragoussis, Jiannis Najafabadi, Hamed S. |
author_sort | Farouni, Rick |
collection | PubMed |
description | Index hopping is the main cause of incorrect sample assignment of sequencing reads in multiplexed pooled libraries. We introduce a statistical model for estimating the sample index-hopping rate in multiplexed droplet-based single-cell RNA-seq data and for probabilistic inference of the true sample of origin of hopped reads. We analyze several datasets and estimate the sample index hopping probability to range between 0.003–0.009, a small number that counter-intuitively gives rise to a large fraction of phantom molecules — the fraction of phantom molecules exceeds 8% in more than 25% of samples and reaches as high as 85% in low-complexity samples. Phantom molecules lead to widespread complications in downstream analyses, including transcriptome mixing across cells, emergence of phantom copies of cells from other samples, and misclassification of empty droplets as cells. We demonstrate that our approach can correct for these artifacts by accurately purging the majority of phantom molecules from the data. |
format | Online Article Text |
id | pubmed-7264361 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-72643612020-06-12 Model-based analysis of sample index hopping reveals its widespread artifacts in multiplexed single-cell RNA-sequencing Farouni, Rick Djambazian, Haig Ferri, Lorenzo E. Ragoussis, Jiannis Najafabadi, Hamed S. Nat Commun Article Index hopping is the main cause of incorrect sample assignment of sequencing reads in multiplexed pooled libraries. We introduce a statistical model for estimating the sample index-hopping rate in multiplexed droplet-based single-cell RNA-seq data and for probabilistic inference of the true sample of origin of hopped reads. We analyze several datasets and estimate the sample index hopping probability to range between 0.003–0.009, a small number that counter-intuitively gives rise to a large fraction of phantom molecules — the fraction of phantom molecules exceeds 8% in more than 25% of samples and reaches as high as 85% in low-complexity samples. Phantom molecules lead to widespread complications in downstream analyses, including transcriptome mixing across cells, emergence of phantom copies of cells from other samples, and misclassification of empty droplets as cells. We demonstrate that our approach can correct for these artifacts by accurately purging the majority of phantom molecules from the data. Nature Publishing Group UK 2020-06-01 /pmc/articles/PMC7264361/ /pubmed/32483174 http://dx.doi.org/10.1038/s41467-020-16522-z Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. 
If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Farouni, Rick Djambazian, Haig Ferri, Lorenzo E. Ragoussis, Jiannis Najafabadi, Hamed S. Model-based analysis of sample index hopping reveals its widespread artifacts in multiplexed single-cell RNA-sequencing |
title | Model-based analysis of sample index hopping reveals its widespread artifacts in multiplexed single-cell RNA-sequencing |
title_full | Model-based analysis of sample index hopping reveals its widespread artifacts in multiplexed single-cell RNA-sequencing |
title_fullStr | Model-based analysis of sample index hopping reveals its widespread artifacts in multiplexed single-cell RNA-sequencing |
title_full_unstemmed | Model-based analysis of sample index hopping reveals its widespread artifacts in multiplexed single-cell RNA-sequencing |
title_short | Model-based analysis of sample index hopping reveals its widespread artifacts in multiplexed single-cell RNA-sequencing |
title_sort | model-based analysis of sample index hopping reveals its widespread artifacts in multiplexed single-cell rna-sequencing |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7264361/ https://www.ncbi.nlm.nih.gov/pubmed/32483174 http://dx.doi.org/10.1038/s41467-020-16522-z |
work_keys_str_mv | AT farounirick modelbasedanalysisofsampleindexhoppingrevealsitswidespreadartifactsinmultiplexedsinglecellrnasequencing AT djambazianhaig modelbasedanalysisofsampleindexhoppingrevealsitswidespreadartifactsinmultiplexedsinglecellrnasequencing AT ferrilorenzoe modelbasedanalysisofsampleindexhoppingrevealsitswidespreadartifactsinmultiplexedsinglecellrnasequencing AT ragoussisjiannis modelbasedanalysisofsampleindexhoppingrevealsitswidespreadartifactsinmultiplexedsinglecellrnasequencing AT najafabadihameds modelbasedanalysisofsampleindexhoppingrevealsitswidespreadartifactsinmultiplexedsinglecellrnasequencing |
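
The description above states that a hopping probability of only 0.003 to 0.009 can still produce a large phantom fraction in low-complexity samples. The following is a minimal sketch of why that can happen, not the article's statistical model: it assumes a toy read-level setting in which every read hops independently with the same probability and a hopped read lands uniformly on one of the other sample indices. The function name, the uniform-landing assumption, and the read depths in the usage note are all hypothetical and chosen only for illustration.

```python
def expected_phantom_read_fraction(reads_per_sample, hop_prob):
    """Expected fraction of reads observed under each sample index that
    originated in a different sample.

    Toy assumptions (not taken from the article): each read hops
    independently with probability `hop_prob`, and a hopped read is
    reassigned uniformly at random to one of the other sample indices.
    """
    n_samples = len(reads_per_sample)
    if n_samples < 2:
        raise ValueError("index hopping needs at least two pooled samples")
    if not 0.0 <= hop_prob <= 1.0:
        raise ValueError("hop_prob must be a probability")

    total_reads = sum(reads_per_sample)
    fractions = []
    for own_reads in reads_per_sample:
        # Reads from all other samples that hop onto this sample's index.
        incoming = (total_reads - own_reads) * hop_prob / (n_samples - 1)
        # Reads from this sample that keep their correct index.
        staying = own_reads * (1.0 - hop_prob)
        observed = incoming + staying
        fractions.append(incoming / observed if observed > 0 else 0.0)
    return fractions
```

Under these assumptions, pooling seven samples of 100 million reads each with one sample of 1 million reads at `hop_prob=0.005` gives the small sample an expected phantom fraction of about one third, while each large sample stays below 0.5%. When all samples have equal depth, the expected fraction equals the hopping probability itself, so the imbalance between samples, not the hopping rate alone, drives the effect in this sketch.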