Semi‐supervised empirical Bayes group‐regularized factor regression

Bibliographic Details
Main Authors: Münch, Magnus M., van de Wiel, Mark A., van der Vaart, Aad W., Peeters, Carel F. W.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9796498/
https://www.ncbi.nlm.nih.gov/pubmed/35730912
http://dx.doi.org/10.1002/bimj.202100105
Description
Summary: The features in a high‐dimensional biomedical prediction problem are often well described by low‐dimensional latent variables (or factors). We use this to include unlabeled features and additional information on the features when building a prediction model. Such additional feature information is often available in biomedical applications. Examples are annotation of genes, metabolites, or p‐values from a previous study. We employ a Bayesian factor regression model that jointly models the features and the outcome using Gaussian latent variables. We fit the model using a computationally efficient variational Bayes method, which scales to high dimensions. We use the extra information to set up a prior model for the features in terms of hyperparameters, which are then estimated through empirical Bayes. The method is demonstrated in simulations and two applications. One application considers influenza vaccine efficacy prediction based on microarray data. The second application predicts oral cancer metastasis from RNAseq data.
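
The semi‐supervised setting described in the summary can be illustrated with a small simulation. The following is a minimal sketch of a generic Gaussian factor regression data‐generating process, not the authors' implementation: the function name, the dimensions, the noise scale, and the use of NaN to mark unlabeled observations are all assumptions made for illustration, and the sketch omits the group‐regularized prior, the variational Bayes fit, and the empirical Bayes step.

```python
import numpy as np

def simulate_factor_regression(n_labeled, n_unlabeled, p, d, noise=0.5, seed=0):
    """Simulate features X and outcome y that share d Gaussian latent factors.

    Hypothetical helper for illustration only; it is not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = n_labeled + n_unlabeled
    loadings = rng.normal(size=(d, p))   # factor loadings for the p features
    beta = rng.normal(size=d)            # regression weights on the factors
    Z = rng.normal(size=(n, d))          # latent factors, one row per sample
    X = Z @ loadings + rng.normal(scale=noise, size=(n, p))
    y = Z @ beta + rng.normal(scale=noise, size=n)
    y[n_labeled:] = np.nan               # unlabeled samples: features only
    return X, y

# 20 labeled and 80 unlabeled samples, 200 features, 3 latent factors
X, y = simulate_factor_regression(n_labeled=20, n_unlabeled=80, p=200, d=3)
labeled = ~np.isnan(y)
```

In this sketch the unlabeled rows of `X` still carry information about the loadings, which is the intuition behind using unlabeled features when the outcome is observed for only a few samples.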