qSNE: quadratic rate t-SNE optimizer with automatic parameter tuning for large datasets

Bibliographic Details
Main authors: Häkkinen, Antti; Koiranen, Juha; Casado, Julia; Kaipio, Katja; Lehtonen, Oskari; Petrucci, Eleonora; Hynninen, Johanna; Hietanen, Sakari; Carpén, Olli; Pasquini, Luca; Biffoni, Mauro; Lehtonen, Rainer; Hautaniemi, Sampsa
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7755412/
https://www.ncbi.nlm.nih.gov/pubmed/32663244
http://dx.doi.org/10.1093/bioinformatics/btaa637
Description
Summary: MOTIVATION: Non-parametric dimensionality reduction techniques, such as t-distributed stochastic neighbor embedding (t-SNE), are the most frequently used methods in the exploratory analysis of single-cell datasets. Current implementations scale poorly to massive datasets and often require downsampling or interpolative approximations, which can leave less-frequent populations undiscovered and much information unexploited. RESULTS: We implemented a fast t-SNE package, qSNE, which uses a quasi-Newton optimizer, allowing a quadratic convergence rate, and an automatic perplexity (level of detail) optimizer. Our results show that these improvements make qSNE significantly faster than regular t-SNE packages and enable full analysis of large datasets, such as mass cytometry data, without downsampling. AVAILABILITY AND IMPLEMENTATION: Source code and documentation are openly available at https://bitbucket.org/anthakki/qsne/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
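
For readers unfamiliar with the perplexity parameter mentioned in the summary, the following is a minimal sketch of how standard t-SNE relates perplexity to the Gaussian bandwidth of a single point. It is not qSNE's automatic perplexity optimizer, whose details the summary does not give. The function names (`conditional_probabilities`, `perplexity`, `sigma_for_perplexity`) and the bisection bounds are assumptions chosen for illustration.

```python
import math


def conditional_probabilities(sq_dists, sigma):
    """Gaussian-kernel neighbor probabilities p_{j|i} for one point i.

    sq_dists holds the squared distances from point i to every other point.
    """
    # Shift by the minimum distance so the largest weight is exp(0) = 1,
    # which avoids underflow of every term at small sigma.
    d_min = min(sq_dists)
    weights = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def perplexity(probs):
    """Perplexity 2**H(P), where H is the Shannon entropy in bits."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return 2.0 ** entropy


def sigma_for_perplexity(sq_dists, target, tol=1e-6, max_iter=200):
    """Bisect on sigma until the perplexity matches the target.

    Perplexity grows with sigma, so bisection applies. The bounds
    below are illustrative assumptions, not values from qSNE.
    """
    lo, hi = 1e-10, 1e10
    mid = 1.0
    for _ in range(max_iter):
        # Geometric midpoint, because sigma spans many orders of magnitude.
        mid = math.sqrt(lo * hi)
        value = perplexity(conditional_probabilities(sq_dists, mid))
        if abs(value - target) < tol:
            break
        if value > target:
            hi = mid
        else:
            lo = mid
    return mid
```

A higher target perplexity yields a wider bandwidth, so each point weighs more neighbors and the embedding emphasizes coarser structure. This is the sense in which the summary glosses perplexity as "level of detail".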