Polled Digital Cell Sorter (p-DCS): Automatic identification of hematological cell types from single cell RNA-sequencing clusters

BACKGROUND: Single cell RNA sequencing (scRNA-seq) brings unprecedented opportunities for mapping the heterogeneity of complex cellular environments such as bone marrow, and provides insight into many cellular processes. Single cell RNA-seq has a far larger fraction of missing data reported as zeros...

Bibliographic Details
Main authors: Domanskyi, Sergii; Szedlak, Anthony; Hawkins, Nathaniel T; Wang, Jiayin; Paternostro, Giovanni; Piermarocchi, Carlo
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6604348/
https://www.ncbi.nlm.nih.gov/pubmed/31262249
http://dx.doi.org/10.1186/s12859-019-2951-x
author Domanskyi, Sergii
Szedlak, Anthony
Hawkins, Nathaniel T
Wang, Jiayin
Paternostro, Giovanni
Piermarocchi, Carlo
collection PubMed
description BACKGROUND: Single cell RNA sequencing (scRNA-seq) brings unprecedented opportunities for mapping the heterogeneity of complex cellular environments such as bone marrow, and provides insight into many cellular processes. Single cell RNA-seq has a far larger fraction of missing data reported as zeros (dropouts) than traditional bulk RNA-seq, and unsupervised clustering combined with Principal Component Analysis (PCA) can be used to overcome this limitation. After clustering, however, one has to interpret the average expression of markers in each cluster to identify the corresponding cell types, and this is normally done manually by an expert curator.
RESULTS: We present a computational tool for processing single cell RNA-seq data that uses a voting algorithm to automatically identify cells based on approval votes received by known molecular markers. Using a stochastic procedure that accounts for imbalances in the number of known molecular signatures for different cell types, the method computes the statistical significance of the final approval score and automatically assigns a cell type to clusters without an expert curator. We demonstrate the utility of the tool in the analysis of eight samples of bone marrow from the Human Cell Atlas. The tool provides a systematic identification of cell types in bone marrow based on a list of markers of immune cell types, and incorporates a suite of visualization tools that can be overlaid on a t-SNE representation. The software is freely available as a Python package at https://github.com/sdomanskyi/DigitalCellSorter.
CONCLUSIONS: This methodology ensures that extensive marker-to-cell-type matching information is taken into account in a systematic way when assigning cell clusters to cell types. Moreover, the method allows for high-throughput processing of multiple scRNA-seq datasets, since it does not involve an expert curator, and it can be applied recursively to obtain cell sub-types. The software is designed to allow the user to substitute the marker-to-cell-type matching information and apply the methodology to different cellular environments.
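The voting idea described in the RESULTS paragraph can be illustrated with a minimal sketch. This is not the DigitalCellSorter implementation: the function names (approval_votes, assign_types), the "above the marker's mean across clusters" approval rule, the column-wise shuffling used to build the null distribution, and the toy data are all assumptions made for illustration. The sketch only shows how approval votes can be normalized by the number of markers per cell type and compared against a randomized baseline.

```python
import numpy as np

def approval_votes(cluster_means, marker_table):
    """Fraction of each cell type's markers that 'approve' each cluster.

    cluster_means: (n_markers, n_clusters) mean expression per cluster.
    marker_table:  (n_markers, n_types) binary, 1 if the marker supports the type.
    Assumed rule: a marker approves a cluster when its mean expression there
    is above that marker's average over all clusters.
    """
    centered = cluster_means - cluster_means.mean(axis=1, keepdims=True)
    approves = (centered > 0).astype(float)        # (n_markers, n_clusters)
    n_per_type = marker_table.sum(axis=0)          # markers known per type
    votes = marker_table.T @ approves              # (n_types, n_clusters)
    return votes / n_per_type[:, None]             # normalize by list length

def assign_types(cluster_means, marker_table, type_names, n_perm=1000, seed=0):
    """Assign each cluster the type whose score is most above a random baseline."""
    rng = np.random.default_rng(seed)
    observed = approval_votes(cluster_means, marker_table)
    null = np.empty((n_perm,) + observed.shape)
    for i in range(n_perm):
        # Shuffle each type's column independently: the number of markers
        # per type is preserved, but the marker identities are randomized.
        shuffled = rng.permuted(marker_table, axis=0)
        null[i] = approval_votes(cluster_means, shuffled)
    sigma = null.std(axis=0)
    z = (observed - null.mean(axis=0)) / np.where(sigma > 0, sigma, 1.0)
    best = z.argmax(axis=0)                        # best type per cluster
    return [type_names[k] for k in best], z

# Toy data (hypothetical): 6 markers, 2 clusters, 2 cell types.
means = np.array([[5, 0], [4, 0], [6, 1],    # markers listed for type 0
                  [0, 5], [1, 4], [0, 6]],   # markers listed for type 1
                 dtype=float)
table = np.array([[1, 0], [1, 0], [1, 0],
                  [0, 1], [0, 1], [0, 1]])
labels, z_scores = assign_types(means, table, ["T cell", "B cell"])
print(labels)  # ['T cell', 'B cell']
```

Dividing by the number of markers per type and scoring against shuffled marker lists is one simple way to keep a cell type with a long marker list from winning by volume alone; the published method may handle this imbalance differently.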
format Online
Article
Text
id pubmed-6604348
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6604348 2019-07-12 Polled Digital Cell Sorter (p-DCS): Automatic identification of hematological cell types from single cell RNA-sequencing clusters Domanskyi, Sergii Szedlak, Anthony Hawkins, Nathaniel T Wang, Jiayin Paternostro, Giovanni Piermarocchi, Carlo BMC Bioinformatics Methodology Article BioMed Central 2019-07-01 /pmc/articles/PMC6604348/ /pubmed/31262249 http://dx.doi.org/10.1186/s12859-019-2951-x Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Polled Digital Cell Sorter (p-DCS): Automatic identification of hematological cell types from single cell RNA-sequencing clusters
topic Methodology Article