CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data
Most existing dimensionality reduction and clustering packages for single-cell RNA-seq (scRNA-seq) data deal with dropouts by heavy modeling and computational machinery. Here, we introduce CIDR (Clustering through Imputation and Dimensionality Reduction), an ultrafast algorithm that uses a novel yet...
Main Authors: | Lin, Peijie; Troup, Michael; Ho, Joshua W. K. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Software |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5371246/ https://www.ncbi.nlm.nih.gov/pubmed/28351406 http://dx.doi.org/10.1186/s13059-017-1188-0 |
_version_ | 1782518382383333376 |
---|---|
author | Lin, Peijie Troup, Michael Ho, Joshua W. K. |
collection | PubMed |
description | Most existing dimensionality reduction and clustering packages for single-cell RNA-seq (scRNA-seq) data deal with dropouts by heavy modeling and computational machinery. Here, we introduce CIDR (Clustering through Imputation and Dimensionality Reduction), an ultrafast algorithm that uses a novel yet very simple implicit imputation approach to alleviate the impact of dropouts in scRNA-seq data in a principled manner. Using a range of simulated and real data, we show that CIDR improves the standard principal component analysis and outperforms the state-of-the-art methods, namely t-SNE, ZIFA, and RaceID, in terms of clustering accuracy. CIDR typically completes within seconds when processing a data set of hundreds of cells and minutes for a data set of thousands of cells. CIDR can be downloaded at https://github.com/VCCRI/CIDR. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-017-1188-0) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5371246 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-53712462017-03-30 CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data Lin, Peijie Troup, Michael Ho, Joshua W. K. Genome Biol Software Most existing dimensionality reduction and clustering packages for single-cell RNA-seq (scRNA-seq) data deal with dropouts by heavy modeling and computational machinery. Here, we introduce CIDR (Clustering through Imputation and Dimensionality Reduction), an ultrafast algorithm that uses a novel yet very simple implicit imputation approach to alleviate the impact of dropouts in scRNA-seq data in a principled manner. Using a range of simulated and real data, we show that CIDR improves the standard principal component analysis and outperforms the state-of-the-art methods, namely t-SNE, ZIFA, and RaceID, in terms of clustering accuracy. CIDR typically completes within seconds when processing a data set of hundreds of cells and minutes for a data set of thousands of cells. CIDR can be downloaded at https://github.com/VCCRI/CIDR. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-017-1188-0) contains supplementary material, which is available to authorized users. BioMed Central 2017-03-28 /pmc/articles/PMC5371246/ /pubmed/28351406 http://dx.doi.org/10.1186/s13059-017-1188-0 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5371246/ https://www.ncbi.nlm.nih.gov/pubmed/28351406 http://dx.doi.org/10.1186/s13059-017-1188-0 |