Semi‐supervised empirical Bayes group‐regularized factor regression

The features in a high‐dimensional biomedical prediction problem are often well described by low‐dimensional latent variables (or factors). We use this to include unlabeled features and additional information on the features when building a prediction model. Such additional feature information is of...

Bibliographic Details
Main Authors: Münch, Magnus M., van de Wiel, Mark A., van der Vaart, Aad W., Peeters, Carel F. W.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2022
Subjects: Statistical Modeling
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9796498/
https://www.ncbi.nlm.nih.gov/pubmed/35730912
http://dx.doi.org/10.1002/bimj.202100105
author Münch, Magnus M.
van de Wiel, Mark A.
van der Vaart, Aad W.
Peeters, Carel F. W.
collection PubMed
description The features in a high‐dimensional biomedical prediction problem are often well described by low‐dimensional latent variables (or factors). We use this to include unlabeled features and additional information on the features when building a prediction model. Such additional feature information is often available in biomedical applications. Examples are annotation of genes, metabolites, or p‐values from a previous study. We employ a Bayesian factor regression model that jointly models the features and the outcome using Gaussian latent variables. We fit the model using a computationally efficient variational Bayes method, which scales to high dimensions. We use the extra information to set up a prior model for the features in terms of hyperparameters, which are then estimated through empirical Bayes. The method is demonstrated in simulations and two applications. One application considers influenza vaccine efficacy prediction based on microarray data. The second application predicts oral cancer metastasis from RNAseq data.
format Online
Article
Text
id pubmed-9796498
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-9796498 2022-12-30 Semi‐supervised empirical Bayes group‐regularized factor regression Münch, Magnus M. van de Wiel, Mark A. van der Vaart, Aad W. Peeters, Carel F. W. Biom J Statistical Modeling The features in a high‐dimensional biomedical prediction problem are often well described by low‐dimensional latent variables (or factors). We use this to include unlabeled features and additional information on the features when building a prediction model. Such additional feature information is often available in biomedical applications. Examples are annotation of genes, metabolites, or p‐values from a previous study. We employ a Bayesian factor regression model that jointly models the features and the outcome using Gaussian latent variables. We fit the model using a computationally efficient variational Bayes method, which scales to high dimensions. We use the extra information to set up a prior model for the features in terms of hyperparameters, which are then estimated through empirical Bayes. The method is demonstrated in simulations and two applications. One application considers influenza vaccine efficacy prediction based on microarray data. The second application predicts oral cancer metastasis from RNAseq data. John Wiley and Sons Inc. 2022-06-22 2022-10 /pmc/articles/PMC9796498/ /pubmed/35730912 http://dx.doi.org/10.1002/bimj.202100105 Text en © 2022 The Authors. Biometrical Journal published by Wiley‐VCH GmbH. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc-nd/4.0/ (https://creativecommons.org/licenses/by-nc-nd/4.0/) License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
title Semi‐supervised empirical Bayes group‐regularized factor regression
topic Statistical Modeling