A mathematical comparison of non‐negative matrix factorization related methods with practical implications for the analysis of mass spectrometry imaging data

Bibliographic Details
Main authors: Nijs, Melanie, Smets, Tina, Waelkens, Etienne, De Moor, Bart
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2021
Subjects: Research Articles
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9285509/
https://www.ncbi.nlm.nih.gov/pubmed/34374141
http://dx.doi.org/10.1002/rcm.9181
author Nijs, Melanie
Smets, Tina
Waelkens, Etienne
De Moor, Bart
collection PubMed
description RATIONALE: Non‐negative matrix factorization (NMF) has been used extensively for the analysis of mass spectrometry imaging (MSI) data, simultaneously visualizing the spatial and spectral distributions present in a slice of tissue. The statistical framework offers two methods related to NMF: probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA), which is a generative model. This work offers a mathematical comparison between NMF, PLSA, and LDA, and includes, for the first time, a detailed evaluation of Kullback–Leibler NMF (KL‐NMF) for MSI. We inspect the results for MSI data analysis, as these different mathematical approaches impose different characteristics on the data and the resulting decomposition.
METHODS: The four methods (NMF, KL‐NMF, PLSA, and LDA) are compared on seven different samples: three originating from mouse pancreas and four from human‐lymph‐node tissues, all obtained using matrix‐assisted laser desorption/ionization time‐of‐flight mass spectrometry (MALDI‐TOF MS).
RESULTS: Although matrix factorization methods are often used for the analysis of MSI data, we find that each method has different implications for the exactness and interpretability of the results. We found promising results using KL‐NMF, which has only rarely been used for MSI so far and which improves on both NMF and PLSA, and we show that the KL‐NMF and PLSA algorithms, hitherto stated to be equivalent, do differ in the case of MSI data analysis. LDA, assumed to be the better method in the field of text mining, is shown to be outperformed by PLSA in the setting of MALDI‐MSI. Additionally, the molecular results of the human‐lymph‐node data have been thoroughly analyzed for better assessment of the methods under investigation.
CONCLUSIONS: We present an in‐depth comparison of multiple NMF‐related factorization methods for MSI. We aim to provide fellow researchers in the field of MSI with a clear understanding of the mathematical implications of using each of these analytical techniques, which might affect the exactness and interpretation of the results.
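To make the distinction between standard (Frobenius-norm) NMF and KL‐NMF named in the description more concrete, the following is a minimal sketch based on the classical Lee-Seung multiplicative updates. It is not the authors' implementation. The function names `nmf_multiplicative` and `generalized_kl`, the synthetic Poisson "pixels × m/z bins" matrix, the rank of 3, and the iteration count are all assumptions chosen for illustration only.

```python
import numpy as np


def nmf_multiplicative(V, rank, loss="frobenius", n_iter=200, seed=0, eps=1e-12):
    """Factorize a non-negative matrix V (pixels x m/z bins) as W @ H.

    Illustrative only: uses Lee-Seung multiplicative updates, which keep
    W and H non-negative as long as they start non-negative.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, rank)) + eps  # spatial loadings (one column per component)
    H = rng.random((rank, n_cols)) + eps  # pseudo-spectra (one row per component)

    for _ in range(n_iter):
        if loss == "frobenius":
            # Updates that do not increase ||V - WH||_F^2
            H *= (W.T @ V) / (W.T @ W @ H + eps)
            W *= (V @ H.T) / (W @ H @ H.T + eps)
        elif loss == "kl":
            # Updates that do not increase the generalized KL divergence D(V || WH)
            WH = W @ H + eps
            H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
            WH = W @ H + eps
            W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
        else:
            raise ValueError(f"unknown loss: {loss!r}")
    return W, H


def generalized_kl(V, V_hat, eps=1e-12):
    """Generalized Kullback-Leibler divergence D(V || V_hat)."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))


# Synthetic count data standing in for an MSI data matrix (an assumption).
rng = np.random.default_rng(1)
V = rng.poisson(5.0, size=(30, 20)).astype(float)

W_fro, H_fro = nmf_multiplicative(V, rank=3, loss="frobenius")
W_kl, H_kl = nmf_multiplicative(V, rank=3, loss="kl")

print("Frobenius error (Frobenius NMF):", np.linalg.norm(V - W_fro @ H_fro))
print("KL divergence   (KL-NMF):       ", generalized_kl(V, W_kl @ H_kl))
```

The two objectives weight residuals differently: the Frobenius norm penalizes absolute deviations equally across all intensities, whereas the KL divergence corresponds to a Poisson-type noise assumption, which may suit count-like spectral data better. PLSA additionally normalizes the data and factors to probability distributions, which this sketch does not attempt to reproduce.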
format Online
Article
Text
id pubmed-9285509
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-9285509 2022-07-18 A mathematical comparison of non‐negative matrix factorization related methods with practical implications for the analysis of mass spectrometry imaging data Nijs, Melanie Smets, Tina Waelkens, Etienne De Moor, Bart Rapid Commun Mass Spectrom Research Articles John Wiley and Sons Inc. 2021-09-20 2021-11-15 /pmc/articles/PMC9285509/ /pubmed/34374141 http://dx.doi.org/10.1002/rcm.9181 Text en
© 2021 The Authors. Rapid Communications in Mass Spectrometry published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
title A mathematical comparison of non‐negative matrix factorization related methods with practical implications for the analysis of mass spectrometry imaging data
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9285509/
https://www.ncbi.nlm.nih.gov/pubmed/34374141
http://dx.doi.org/10.1002/rcm.9181