Inference and Learning in a Latent Variable Model for Beta Distributed Interval Data

Latent Variable Models (LVMs) are well established tools to accomplish a range of different data processing tasks. Applications exploit the ability of LVMs to identify latent data structure in order to improve data (e.g., through denoising) or to estimate the relation between latent causes and measu...

Bibliographic Details
Main Authors: Mousavi, Hamid; Buhl, Mareike; Guiraud, Enrico; Drefs, Jakob; Lücke, Jörg
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8145930/
https://www.ncbi.nlm.nih.gov/pubmed/33947060
http://dx.doi.org/10.3390/e23050552
author Mousavi, Hamid
Buhl, Mareike
Guiraud, Enrico
Drefs, Jakob
Lücke, Jörg
collection PubMed
description Latent Variable Models (LVMs) are well established tools to accomplish a range of different data processing tasks. Applications exploit the ability of LVMs to identify latent data structure in order to improve data (e.g., through denoising) or to estimate the relation between latent causes and measurements in medical data. In the latter case, LVMs in the form of noisy-OR Bayes nets represent the standard approach to relate binary latents (which represent diseases) to binary observables (which represent symptoms). Bayes nets with binary representation for symptoms may be perceived as a coarse approximation, however. In practice, real disease symptoms can range from absent over mild and intermediate to very severe. Therefore, using diseases/symptoms relations as motivation, we here ask how standard noisy-OR Bayes nets can be generalized to incorporate continuous observables, e.g., variables that model symptom severity in an interval from healthy to pathological. This transition from binary to interval data poses a number of challenges including a transition from a Bernoulli to a Beta distribution to model symptom statistics. While noisy-OR-like approaches are constrained to model how causes determine the observables’ mean values, the use of Beta distributions additionally provides (and also requires) that the causes determine the observables’ variances. To meet the challenges emerging when generalizing from Bernoulli to Beta distributed observables, we investigate a novel LVM that uses a maximum non-linearity to model how the latents determine means and variances of the observables. Given the model and the goal of likelihood maximization, we then leverage recent theoretical results to derive an Expectation Maximization (EM) algorithm for the suggested LVM. We further show how variational EM can be used to efficiently scale the approach to large networks. 
Experimental results finally illustrate the efficacy of the proposed model using both synthetic and real data sets. Importantly, we show that the model produces reliable results in estimating causes using proofs of concepts and first tests based on real medical data and on images.
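
The description above outlines binary latents (diseases) that determine, through a maximum non-linearity, both the mean and the variance of Beta distributed observables (symptom severity in an interval). The following is a minimal generative sketch of that idea in Python with NumPy. It is not the authors' parameterization or their EM algorithm: the function names `beta_params` and `sample_beta_max_lvm`, the parameters `pi`, `mean_w`, `var_frac`, `base_mean`, `base_frac`, and the rule that the active latent with the largest mean weight sets both mean and variance are all assumptions made for illustration.

```python
import numpy as np


def beta_params(mean, var):
    """Convert a mean/variance pair into Beta shape parameters (a, b).

    Valid only for 0 < mean < 1 and 0 < var < mean * (1 - mean).
    """
    nu = mean * (1.0 - mean) / var - 1.0
    return mean * nu, (1.0 - mean) * nu


def sample_beta_max_lvm(n, pi, mean_w, var_frac,
                        base_mean=0.05, base_frac=0.1, seed=0):
    """Draw n samples from a toy LVM with binary latents and Beta observables.

    All parameter names are hypothetical, chosen for this sketch:
    pi       : (H,)   prior activation probability of each binary latent
    mean_w   : (H, D) mean of observable d when latent h is the dominant cause
    var_frac : (H, D) variance as a fraction of mean * (1 - mean), in (0, 1)
    base_*   : values used when no latent is active (assumed baseline)
    """
    rng = np.random.default_rng(seed)
    H, D = mean_w.shape
    s = rng.random((n, H)) < pi                      # binary latents, (n, H)

    # Maximum non-linearity: for each observable, the active latent with
    # the largest mean weight determines the Beta mean and variance.
    masked = np.where(s[:, :, None], mean_w[None], -np.inf)   # (n, H, D)
    winner = masked.argmax(axis=1)                   # (n, D)
    any_active = s.any(axis=1)[:, None]              # (n, 1)
    cols = np.arange(D)[None, :]

    mean = np.where(any_active, mean_w[winner, cols], base_mean)
    frac = np.where(any_active, var_frac[winner, cols], base_frac)
    var = frac * mean * (1.0 - mean)

    a, b = beta_params(mean, var)
    return s, rng.beta(a, b)                         # observables in [0, 1]
```

Expressing the variance as a fraction of `mean * (1 - mean)` keeps both Beta shape parameters positive, which is one simple way to respect the constraint that the causes must set a valid variance alongside the mean.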
format Online
Article
Text
id pubmed-8145930
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8145930 2021-05-26 Inference and Learning in a Latent Variable Model for Beta Distributed Interval Data Mousavi, Hamid Buhl, Mareike Guiraud, Enrico Drefs, Jakob Lücke, Jörg Entropy (Basel) Article MDPI 2021-04-29 /pmc/articles/PMC8145930/ /pubmed/33947060 http://dx.doi.org/10.3390/e23050552 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Inference and Learning in a Latent Variable Model for Beta Distributed Interval Data
topic Article