SMSSVD: SubMatrix Selection Singular Value Decomposition

MOTIVATION: High-throughput biomedical measurements normally capture multiple overlaid biologically relevant signals and often also signals representing different types of technical artefacts, such as batch effects. Signal identification and decomposition are accordingly main objectives in statisti...

Bibliographic Details
Main authors: Henningsson, Rasmus, Fontes, Magnus
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6361234/
https://www.ncbi.nlm.nih.gov/pubmed/30010791
http://dx.doi.org/10.1093/bioinformatics/bty566
_version_ 1783392654091878400
author Henningsson, Rasmus
Fontes, Magnus
author_facet Henningsson, Rasmus
Fontes, Magnus
author_sort Henningsson, Rasmus
collection PubMed
description MOTIVATION: High-throughput biomedical measurements normally capture multiple overlaid biologically relevant signals and often also signals representing different types of technical artefacts, such as batch effects. Signal identification and decomposition are accordingly main objectives in statistical biomedical modeling and data analysis. Existing methods aimed at signal reconstruction and deconvolution are, in general, either supervised, contain parameters that need to be estimated, or present other types of ad hoc features. We here introduce SubMatrix Selection Singular Value Decomposition (SMSSVD), a parameter-free unsupervised signal decomposition and dimension reduction method, designed to reduce noise adaptively for each low-rank signal in a given data matrix, and to represent the signals in the data in a way that enables unbiased exploratory analysis and reconstruction of multiple overlaid signals, including identifying groups of variables that drive different signals. RESULTS: The SMSSVD method produces a denoised signal decomposition from a given data matrix. It also guarantees orthogonality between signal components in a straightforward manner, and it is designed to make automation possible. We illustrate SMSSVD by applying it to several real and synthetic datasets and compare its performance to gold-standard methods such as PCA (Principal Component Analysis) and SPC (Sparse Principal Components, using Lasso constraints). SMSSVD is computationally efficient and, despite being a parameter-free method, in general outperforms existing statistical learning methods. AVAILABILITY AND IMPLEMENTATION: A Julia implementation of SMSSVD is openly available on GitHub (https://github.com/rasmushenningsson/SubMatrixSelectionSVD.jl). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6361234
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6361234 2019-02-08 SMSSVD: SubMatrix Selection Singular Value Decomposition Henningsson, Rasmus Fontes, Magnus Bioinformatics Original Papers Oxford University Press 2019-02-01 2018-07-13 /pmc/articles/PMC6361234/ /pubmed/30010791 http://dx.doi.org/10.1093/bioinformatics/bty566 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Papers
Henningsson, Rasmus
Fontes, Magnus
SMSSVD: SubMatrix Selection Singular Value Decomposition
title SMSSVD: SubMatrix Selection Singular Value Decomposition
title_full SMSSVD: SubMatrix Selection Singular Value Decomposition
title_fullStr SMSSVD: SubMatrix Selection Singular Value Decomposition
title_full_unstemmed SMSSVD: SubMatrix Selection Singular Value Decomposition
title_short SMSSVD: SubMatrix Selection Singular Value Decomposition
title_sort smssvd: submatrix selection singular value decomposition
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6361234/
https://www.ncbi.nlm.nih.gov/pubmed/30010791
http://dx.doi.org/10.1093/bioinformatics/bty566
work_keys_str_mv AT henningssonrasmus smssvdsubmatrixselectionsingularvaluedecomposition
AT fontesmagnus smssvdsubmatrixselectionsingularvaluedecomposition
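
Illustrative note: the abstract in this record compares SMSSVD with PCA, which rests on the ordinary singular value decomposition of the data matrix. The sketch below is not the SMSSVD algorithm and is not taken from the authors' Julia package. It only shows, in plain Python, the baseline idea of extracting one low-rank signal (the leading singular triplet) from a variables-by-samples matrix by power iteration. The function names and the toy data are hypothetical and chosen for illustration only.

```python
import math


def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    """Return the transpose of A as a list of rows."""
    return [list(col) for col in zip(*A)]


def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(v * v for v in x))


def leading_singular_triplet(A, iters=200):
    """Power iteration on A^T A.

    Returns (sigma, u, v): the largest singular value of A with its
    left and right singular vectors. Assumes the all-ones start vector
    is not orthogonal to the leading right singular vector.
    """
    At = transpose(A)
    v = [1.0] * len(A[0])
    n = norm(v)
    v = [x / n for x in v]
    for _ in range(iters):
        w = matvec(At, matvec(A, v))
        n = norm(w)
        if n == 0.0:  # A is the zero matrix
            break
        v = [x / n for x in w]
    Av = matvec(A, v)
    sigma = norm(Av)
    u = [x / sigma for x in Av] if sigma > 0.0 else Av
    return sigma, u, v


def rank_one(sigma, u, v):
    """Rank-one matrix sigma * u * v^T, i.e. the recovered signal."""
    return [[sigma * ui * vj for vj in v] for ui in u]


# Toy data (hypothetical): 4 variables x 3 samples. The first two
# variables carry a shared signal; the last two hold small noise.
X = [
    [2.0, 4.0, 6.0],
    [1.0, 2.0, 3.0],
    [0.1, -0.1, 0.0],
    [0.0, 0.1, -0.1],
]

sigma, u, v = leading_singular_triplet(X)
signal = rank_one(sigma, u, v)

# Residual after removing the leading signal; a second signal, if any,
# could be sought in this matrix (deflation).
residual = [[x - s for x, s in zip(xr, sr)] for xr, sr in zip(X, signal)]

print("leading singular value:", round(sigma, 4))
print("variable loadings (u):", [round(x, 3) for x in u])
```

In this toy matrix the loadings in u are expected to be largest for the two signal-carrying variables, which loosely mirrors the abstract's point about identifying the groups of variables that drive a signal. How SMSSVD itself selects submatrices is described in the article, not here.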