
CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data

Most existing dimensionality reduction and clustering packages for single-cell RNA-seq (scRNA-seq) data deal with dropouts by heavy modeling and computational machinery. Here, we introduce CIDR (Clustering through Imputation and Dimensionality Reduction), an ultrafast algorithm that uses a novel yet...


Bibliographic Details
Main authors: Lin, Peijie; Troup, Michael; Ho, Joshua W. K.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5371246/
https://www.ncbi.nlm.nih.gov/pubmed/28351406
http://dx.doi.org/10.1186/s13059-017-1188-0
author Lin, Peijie
Troup, Michael
Ho, Joshua W. K.
collection PubMed
description Most existing dimensionality reduction and clustering packages for single-cell RNA-seq (scRNA-seq) data deal with dropouts by heavy modeling and computational machinery. Here, we introduce CIDR (Clustering through Imputation and Dimensionality Reduction), an ultrafast algorithm that uses a novel yet very simple implicit imputation approach to alleviate the impact of dropouts in scRNA-seq data in a principled manner. Using a range of simulated and real data, we show that CIDR improves the standard principal component analysis and outperforms the state-of-the-art methods, namely t-SNE, ZIFA, and RaceID, in terms of clustering accuracy. CIDR typically completes within seconds when processing a data set of hundreds of cells and minutes for a data set of thousands of cells. CIDR can be downloaded at https://github.com/VCCRI/CIDR. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-017-1188-0) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5371246
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5371246 2017-03-30 CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data Lin, Peijie Troup, Michael Ho, Joshua W. K. Genome Biol Software BioMed Central 2017-03-28 /pmc/articles/PMC5371246/ /pubmed/28351406 http://dx.doi.org/10.1186/s13059-017-1188-0 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data
topic Software