From heterogeneous healthcare data to disease-specific biomarker networks: A hierarchical Bayesian network approach

In this work, we introduce an entirely data-driven and automated approach to reveal disease-associated biomarker and risk factor networks from heterogeneous and high-dimensional healthcare data. Our workflow is based on Bayesian networks, which are a popular tool for analyzing the interplay of bioma...

Bibliographic Details
Main authors: Becker, Ann-Kristin, Dörr, Marcus, Felix, Stephan B., Frost, Fabian, Grabe, Hans J., Lerch, Markus M., Nauck, Matthias, Völker, Uwe, Völzke, Henry, Kaderali, Lars
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7906470/
https://www.ncbi.nlm.nih.gov/pubmed/33577591
http://dx.doi.org/10.1371/journal.pcbi.1008735
_version_ 1783655295191351296
author Becker, Ann-Kristin
Dörr, Marcus
Felix, Stephan B.
Frost, Fabian
Grabe, Hans J.
Lerch, Markus M.
Nauck, Matthias
Völker, Uwe
Völzke, Henry
Kaderali, Lars
author_sort Becker, Ann-Kristin
collection PubMed
description In this work, we introduce an entirely data-driven and automated approach to reveal disease-associated biomarker and risk factor networks from heterogeneous and high-dimensional healthcare data. Our workflow is based on Bayesian networks, which are a popular tool for analyzing the interplay of biomarkers. Usually, data require extensive manual preprocessing and dimension reduction to allow for effective learning of Bayesian networks. For heterogeneous data, this preprocessing is hard to automatize and typically requires domain-specific prior knowledge. We here combine Bayesian network learning with hierarchical variable clustering in order to detect groups of similar features and learn interactions between them entirely automated. We present an optimization algorithm for the adaptive refinement of such group Bayesian networks to account for a specific target variable, like a disease. The combination of Bayesian networks, clustering, and refinement yields low-dimensional but disease-specific interaction networks. These networks provide easily interpretable, yet accurate models of biomarker interdependencies. We test our method extensively on simulated data, as well as on data from the Study of Health in Pomerania (SHIP-TREND), and demonstrate its effectiveness using non-alcoholic fatty liver disease and hypertension as examples. We show that the group network models outperform available biomarker scores, while at the same time, they provide an easily interpretable interaction network.
format Online
Article
Text
id pubmed-7906470
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7906470 2021-03-03 From heterogeneous healthcare data to disease-specific biomarker networks: A hierarchical Bayesian network approach Becker, Ann-Kristin Dörr, Marcus Felix, Stephan B. Frost, Fabian Grabe, Hans J. Lerch, Markus M. Nauck, Matthias Völker, Uwe Völzke, Henry Kaderali, Lars PLoS Comput Biol Research Article
Public Library of Science 2021-02-12 /pmc/articles/PMC7906470/ /pubmed/33577591 http://dx.doi.org/10.1371/journal.pcbi.1008735 Text en © 2021 Becker et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title From heterogeneous healthcare data to disease-specific biomarker networks: A hierarchical Bayesian network approach
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7906470/
https://www.ncbi.nlm.nih.gov/pubmed/33577591
http://dx.doi.org/10.1371/journal.pcbi.1008735