
Digital Cell Sorter (DCS): a cell type identification, anomaly detection, and Hopfield landscapes toolkit for single-cell transcriptomics

MOTIVATION: Analysis of single-cell RNA sequencing (scRNA-seq) typically consists of different steps including quality control, batch correction, clustering, cell identification and characterization, and visualization. The amount of scRNA-seq data is growing extremely fast, and novel algorithmic appr...


Bibliographic Details
Main authors: Domanskyi, Sergii, Hakansson, Alex, Bertus, Thomas J., Paternostro, Giovanni, Piermarocchi, Carlo
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7811293/
https://www.ncbi.nlm.nih.gov/pubmed/33520459
http://dx.doi.org/10.7717/peerj.10670
collection PubMed
description MOTIVATION: Analysis of single-cell RNA sequencing (scRNA-seq) typically consists of different steps including quality control, batch correction, clustering, cell identification and characterization, and visualization. The amount of scRNA-seq data is growing extremely fast, and novel algorithmic approaches improving these steps are key to extracting more biological information. Here, we introduce: (i) two methods for automatic cell type identification (i.e., without an expert curator) based on a voting algorithm and a Hopfield classifier, (ii) a method for cell anomaly quantification based on isolation forest, and (iii) a tool for the visualization of cell phenotypic landscapes based on Hopfield energy-like functions. These new approaches are integrated in a software platform that includes many other state-of-the-art methodologies and provides a self-contained toolkit for scRNA-seq analysis. RESULTS: We present a suite of software elements for the analysis of scRNA-seq data. This Python-based open-source software, Digital Cell Sorter (DCS), is an extensive toolkit of methods for scRNA-seq analysis. We illustrate the capability of the software using large datasets of peripheral blood mononuclear cells (PBMC), as well as plasma cells of bone marrow samples from healthy donors and multiple myeloma patients. We test the novel algorithms by evaluating their ability to deconvolve cell mixtures and detect small numbers of anomalous cells in PBMC data. AVAILABILITY: The DCS toolkit is available for download and installation through the Python Package Index (PyPI). The software can be deployed using the Python import function following installation. Source code is also available for download on Zenodo: DOI 10.5281/zenodo.2533377. SUPPLEMENTARY INFORMATION: Supplemental Materials are available at PeerJ online.
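The description mentions a Hopfield classifier and Hopfield energy-like functions without giving formulas. As a rough illustration only, the sketch below uses the textbook Hopfield model (Hebbian couplings, energy E(s) = -1/2 Σ W_ij s_i s_j, classification by largest overlap with a stored pattern). It is not the DCS implementation or its API: the function names, the toy ±1 patterns standing in for binarized marker-gene profiles, and the labels "type A" and "type B" are all hypothetical.

```python
# Textbook Hopfield sketch in pure Python. NOT the DCS implementation:
# all names and the toy patterns below are hypothetical illustrations.

def hebbian_weights(patterns):
    """Coupling matrix W_ij = (1/n) * sum over patterns of p_i * p_j, zero diagonal."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w


def energy(w, state):
    """Hopfield energy E(s) = -1/2 * sum_ij W_ij * s_i * s_j."""
    n = len(state)
    return -0.5 * sum(
        w[i][j] * state[i] * state[j] for i in range(n) for j in range(n)
    )


def classify(patterns, labels, state):
    """Return the label of the stored pattern with the largest overlap with state."""
    overlaps = [sum(a * b for a, b in zip(p, state)) for p in patterns]
    return labels[overlaps.index(max(overlaps))]


# Two orthogonal toy patterns (+1 = marker expressed, -1 = not expressed).
patterns = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
]
labels = ["type A", "type B"]
w = hebbian_weights(patterns)

# A noisy "cell": the first pattern with one entry flipped.
noisy = [1, 1, 1, -1, -1, -1, -1, -1]

print(energy(w, patterns[0]))            # energy at a stored pattern
print(energy(w, noisy))                  # energy of the perturbed state
print(classify(patterns, labels, noisy))
```

In this toy setup the stored pattern sits at lower energy than its perturbed copy, which is the sense in which stored patterns act as minima of an energy landscape; how DCS builds its landscapes from expression data is described in the article itself.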
id pubmed-7811293
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-7811293 2021-01-28 PeerJ Bioinformatics PeerJ Inc. 2021-01-13 /pmc/articles/PMC7811293/ /pubmed/33520459 http://dx.doi.org/10.7717/peerj.10670 Text en
© 2021 Domanskyi et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
topic Bioinformatics