
NMFClustering: Accessible NMF-based clustering utilizing GPU acceleration

Non-negative Matrix Factorization (NMF) is an algorithm that can reduce high-dimensional datasets of tens of thousands of genes to a handful of metagenes that are biologically easier to interpret. Application of NMF to gene expression data has been limited by its computationally intensive nature, w...


Bibliographic Details
Main authors: Liefeld, Ted, Huang, Edwin, Wenzel, Alexander T., Yoshimoto, Kenneth, Sharma, Ashwyn K, Sicklick, Jason K, Mesirov, Jill P, Reich, Michael
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10312797/
https://www.ncbi.nlm.nih.gov/pubmed/37398372
http://dx.doi.org/10.1101/2023.06.16.545370
author Liefeld, Ted
Huang, Edwin
Wenzel, Alexander T.
Yoshimoto, Kenneth
Sharma, Ashwyn K
Sicklick, Jason K
Mesirov, Jill P
Reich, Michael
collection PubMed
description Non-negative Matrix Factorization (NMF) is an algorithm that can reduce high-dimensional datasets of tens of thousands of genes to a handful of metagenes that are biologically easier to interpret. Application of NMF to gene expression data has been limited by its computationally intensive nature, which hinders its use on large datasets such as single-cell RNA sequencing (scRNA-seq) count matrices. We have implemented NMF-based clustering to run on high-performance GPU compute nodes using CuPy, a GPU-backed Python library, and the Message Passing Interface (MPI). This reduces the computation time by up to three orders of magnitude and makes the NMF Clustering analysis of large RNA-Seq and scRNA-seq datasets practical. We have made the method freely available through the GenePattern gateway, which provides free public access to hundreds of tools for the analysis and visualization of multiple 'omic data types. Its web-based interface gives easy access to these tools and allows the creation of multi-step analysis pipelines on high-performance computing (HPC) clusters that enable reproducible in silico research for non-programmers.
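The abstract describes NMF-based clustering only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes the classic multiplicative update rule for the Frobenius objective (the record does not say which update rule the tool uses), runs on the CPU with NumPy, and assigns each sample to the metagene with the largest coefficient. The function name `nmf_cluster` and its parameters are hypothetical.

```python
import numpy as np


def nmf_cluster(V, k, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (genes x samples) as W @ H.

    W (genes x k) holds the metagenes, H (k x samples) their per-sample
    coefficients. Uses multiplicative updates, an assumption made for
    illustration, and labels each sample by its dominant metagene.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    # Strictly positive random start; multiplicative updates keep signs.
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    # Rescale so every metagene sums to 1; W @ H is unchanged, and the
    # rows of H become comparable before taking the argmax.
    scale = W.sum(axis=0)
    W = W / scale
    H = H * scale[:, None]
    labels = H.argmax(axis=0)
    return W, H, labels


# Toy expression matrix: two gene programs, each active in five samples.
V_demo = np.full((40, 10), 0.1)
V_demo[:20, :5] += 5.0
V_demo[20:, 5:] += 5.0
W_demo, H_demo, labels_demo = nmf_cluster(V_demo, k=2)
print(labels_demo)
```

Because CuPy mirrors much of the NumPy array API, a sketch like this could in principle be moved to a GPU by swapping the array module, but the MPI distribution across compute nodes mentioned in the abstract is not represented here.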
format Online
Article
Text
id pubmed-10312797
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-103127972023-07-01 NMFClustering: Accessible NMF-based clustering utilizing GPU acceleration Liefeld, Ted Huang, Edwin Wenzel, Alexander T. Yoshimoto, Kenneth Sharma, Ashwyn K Sicklick, Jason K Mesirov, Jill P Reich, Michael bioRxiv Article Non-negative Matrix Factorization (NMF) is an algorithm that can reduce high-dimensional datasets of tens of thousands of genes to a handful of metagenes that are biologically easier to interpret. Application of NMF to gene expression data has been limited by its computationally intensive nature, which hinders its use on large datasets such as single-cell RNA sequencing (scRNA-seq) count matrices. We have implemented NMF-based clustering to run on high-performance GPU compute nodes using CuPy, a GPU-backed Python library, and the Message Passing Interface (MPI). This reduces the computation time by up to three orders of magnitude and makes the NMF Clustering analysis of large RNA-Seq and scRNA-seq datasets practical. We have made the method freely available through the GenePattern gateway, which provides free public access to hundreds of tools for the analysis and visualization of multiple 'omic data types. Its web-based interface gives easy access to these tools and allows the creation of multi-step analysis pipelines on high-performance computing (HPC) clusters that enable reproducible in silico research for non-programmers. Cold Spring Harbor Laboratory 2023-06-27 /pmc/articles/PMC10312797/ /pubmed/37398372 http://dx.doi.org/10.1101/2023.06.16.545370 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title NMFClustering: Accessible NMF-based clustering utilizing GPU acceleration
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10312797/
https://www.ncbi.nlm.nih.gov/pubmed/37398372
http://dx.doi.org/10.1101/2023.06.16.545370