ILoReg: a tool for high-resolution cell population identification from single-cell RNA-seq data
MOTIVATION: Single-cell RNA-seq allows researchers to identify cell populations based on unsupervised clustering of the transcriptome. However, subpopulations can have only subtle transcriptomic differences and the high dimensionality of the data makes their identification challenging. RESULTS: We i...
Main Authors: | Smolander, Johannes; Junttila, Sini; Venäläinen, Mikko S; Elo, Laura L |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | Original Papers |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8150131/ https://www.ncbi.nlm.nih.gov/pubmed/33151294 http://dx.doi.org/10.1093/bioinformatics/btaa919 |
_version_ | 1783698096380706816 |
---|---|
author | Smolander, Johannes; Junttila, Sini; Venäläinen, Mikko S; Elo, Laura L |
author_facet | Smolander, Johannes; Junttila, Sini; Venäläinen, Mikko S; Elo, Laura L |
author_sort | Smolander, Johannes |
collection | PubMed |
description | MOTIVATION: Single-cell RNA-seq allows researchers to identify cell populations based on unsupervised clustering of the transcriptome. However, subpopulations can have only subtle transcriptomic differences and the high dimensionality of the data makes their identification challenging. RESULTS: We introduce ILoReg, an R package implementing a new cell population identification method that improves identification of cell populations with subtle differences through a probabilistic feature extraction step that is applied before clustering and visualization. The feature extraction is performed using a novel machine learning algorithm, called iterative clustering projection (ICP), that uses logistic regression and clustering similarity comparison to iteratively cluster data. Remarkably, ICP also manages to integrate feature selection with the clustering through L1-regularization, enabling the identification of genes that are differentially expressed between cell populations. By combining solutions of multiple ICP runs into a single consensus solution, ILoReg creates a representation that enables investigating cell populations with a high resolution. In particular, we show that the visualization of ILoReg allows segregation of immune and pancreatic cell populations in a more pronounced manner compared with current state-of-the-art methods. AVAILABILITY AND IMPLEMENTATION: ILoReg is available as an R package at https://bioconductor.org/packages/ILoReg. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-8150131 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-81501312021-05-28 ILoReg: a tool for high-resolution cell population identification from single-cell RNA-seq data Smolander, Johannes Junttila, Sini Venäläinen, Mikko S Elo, Laura L Bioinformatics Original Papers MOTIVATION: Single-cell RNA-seq allows researchers to identify cell populations based on unsupervised clustering of the transcriptome. However, subpopulations can have only subtle transcriptomic differences and the high dimensionality of the data makes their identification challenging. RESULTS: We introduce ILoReg, an R package implementing a new cell population identification method that improves identification of cell populations with subtle differences through a probabilistic feature extraction step that is applied before clustering and visualization. The feature extraction is performed using a novel machine learning algorithm, called iterative clustering projection (ICP), that uses logistic regression and clustering similarity comparison to iteratively cluster data. Remarkably, ICP also manages to integrate feature selection with the clustering through L1-regularization, enabling the identification of genes that are differentially expressed between cell populations. By combining solutions of multiple ICP runs into a single consensus solution, ILoReg creates a representation that enables investigating cell populations with a high resolution. In particular, we show that the visualization of ILoReg allows segregation of immune and pancreatic cell populations in a more pronounced manner compared with current state-of-the-art methods. AVAILABILITY AND IMPLEMENTATION: ILoReg is available as an R package at https://bioconductor.org/packages/ILoReg. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2020-12-13 /pmc/articles/PMC8150131/ /pubmed/33151294 http://dx.doi.org/10.1093/bioinformatics/btaa919 Text en © The Author(s) 2020. Published by Oxford University Press. 
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Papers Smolander, Johannes Junttila, Sini Venäläinen, Mikko S Elo, Laura L ILoReg: a tool for high-resolution cell population identification from single-cell RNA-seq data |
title | ILoReg: a tool for high-resolution cell population identification from single-cell RNA-seq data |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8150131/ https://www.ncbi.nlm.nih.gov/pubmed/33151294 http://dx.doi.org/10.1093/bioinformatics/btaa919 |
work_keys_str_mv | AT smolanderjohannes iloregatoolforhighresolutioncellpopulationidentificationfromsinglecellrnaseqdata AT junttilasini iloregatoolforhighresolutioncellpopulationidentificationfromsinglecellrnaseqdata AT venalainenmikkos iloregatoolforhighresolutioncellpopulationidentificationfromsinglecellrnaseqdata AT elolaural iloregatoolforhighresolutioncellpopulationidentificationfromsinglecellrnaseqdata |
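
The description above says that ICP combines logistic regression, a clustering similarity comparison, and L1-regularization to cluster data iteratively. The record gives no further detail, so the following Python sketch is only an illustration of that idea and not the ILoReg implementation (which is an R package). Every specific in it is an assumption made for the example: the function names (`icp_sketch`, `fit_l1_softmax`, `adjusted_rand_index`), the random balanced starting labels, training on a random subset of cells, the adjusted Rand index as the similarity measure, the acceptance rule, and all parameter values.

```python
import math
import random


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same items."""
    def pairs(m):
        return m * (m - 1) / 2
    joint, rows, cols = {}, {}, {}
    for x, y in zip(a, b):
        joint[(x, y)] = joint.get((x, y), 0) + 1
        rows[x] = rows.get(x, 0) + 1
        cols[y] = cols.get(y, 0) + 1
    sum_joint = sum(pairs(m) for m in joint.values())
    sum_rows = sum(pairs(m) for m in rows.values())
    sum_cols = sum(pairs(m) for m in cols.values())
    expected = sum_rows * sum_cols / pairs(len(a))
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster in both
        return 1.0
    return (sum_joint - expected) / (max_index - expected)


def predict_proba(W, b, x):
    """Softmax class probabilities for one sample."""
    scores = [sum(w * v for w, v in zip(Wc, x)) + bc for Wc, bc in zip(W, b)]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fit_l1_softmax(X, y, k, l1=0.01, lr=0.5, epochs=200):
    """Multinomial logistic regression fitted by proximal gradient descent."""
    n, d = len(X), len(X[0])
    W = [[0.0] * d for _ in range(k)]
    b = [0.0] * k
    for _ in range(epochs):
        gW = [[0.0] * d for _ in range(k)]
        gb = [0.0] * k
        for xi, yi in zip(X, y):
            p = predict_proba(W, b, xi)
            for c in range(k):
                err = p[c] - (1.0 if c == yi else 0.0)
                gb[c] += err / n
                for j in range(d):
                    gW[c][j] += err * xi[j] / n
        for c in range(k):
            b[c] -= lr * gb[c]
            for j in range(d):
                w = W[c][j] - lr * gW[c][j]
                # Soft-thresholding is the proximal step for the L1 penalty;
                # it sets small weights exactly to zero (feature selection).
                W[c][j] = math.copysign(max(abs(w) - lr * l1, 0.0), w)
    return W, b


def icp_sketch(X, k, train_fraction=0.5, patience=5, max_rounds=30, seed=0):
    """Hypothetical ICP-like loop: classify, re-project, compare, repeat."""
    rng = random.Random(seed)
    n = len(X)
    labels = [i % k for i in range(n)]
    rng.shuffle(labels)  # assumed start: random, balanced clustering
    probs = [[1.0 / k] * k for _ in range(n)]
    best_ari, fails = 0.0, 0
    for _ in range(max_rounds):
        if fails >= patience:
            break
        # Assumption: train on a random subset of the cells in each round.
        idx = rng.sample(range(n), max(k, int(train_fraction * n)))
        W, b = fit_l1_softmax([X[i] for i in idx], [labels[i] for i in idx], k)
        new_probs = [predict_proba(W, b, x) for x in X]
        new_labels = [p.index(max(p)) for p in new_probs]
        ari = adjusted_rand_index(labels, new_labels)
        if ari > best_ari:  # accept only if the similarity improved
            labels, probs, best_ari, fails = new_labels, new_probs, ari, 0
        else:
            fails += 1
    return labels, probs
```

Calling `icp_sketch(X, k)` on a list of numeric feature vectors `X` returns one cluster label per row together with a per-row probability vector over the `k` clusters. In this sketch the probability matrix plays the role of the probabilistic features mentioned in the description; concatenating the matrices from several runs with different seeds would be one way to imitate the consensus step, though how ILoReg actually combines runs is not stated in this record.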