From heterogeneous healthcare data to disease-specific biomarker networks: A hierarchical Bayesian network approach
In this work, we introduce an entirely data-driven and automated approach to reveal disease-associated biomarker and risk factor networks from heterogeneous and high-dimensional healthcare data. Our workflow is based on Bayesian networks, which are a popular tool for analyzing the interplay of bioma...
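Main authors: | Becker, Ann-Kristin; Dörr, Marcus; Felix, Stephan B.; Frost, Fabian; Grabe, Hans J.; Lerch, Markus M.; Nauck, Matthias; Völker, Uwe; Völzke, Henry; Kaderali, Lars |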
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2021 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7906470/ https://www.ncbi.nlm.nih.gov/pubmed/33577591 http://dx.doi.org/10.1371/journal.pcbi.1008735 |
_version_ | 1783655295191351296 |
---|---|
author | Becker, Ann-Kristin Dörr, Marcus Felix, Stephan B. Frost, Fabian Grabe, Hans J. Lerch, Markus M. Nauck, Matthias Völker, Uwe Völzke, Henry Kaderali, Lars |
author_facet | Becker, Ann-Kristin Dörr, Marcus Felix, Stephan B. Frost, Fabian Grabe, Hans J. Lerch, Markus M. Nauck, Matthias Völker, Uwe Völzke, Henry Kaderali, Lars |
author_sort | Becker, Ann-Kristin |
collection | PubMed |
description | In this work, we introduce an entirely data-driven and automated approach to reveal disease-associated biomarker and risk factor networks from heterogeneous and high-dimensional healthcare data. Our workflow is based on Bayesian networks, which are a popular tool for analyzing the interplay of biomarkers. Usually, data require extensive manual preprocessing and dimension reduction to allow for effective learning of Bayesian networks. For heterogeneous data, this preprocessing is hard to automatize and typically requires domain-specific prior knowledge. We here combine Bayesian network learning with hierarchical variable clustering in order to detect groups of similar features and learn interactions between them entirely automated. We present an optimization algorithm for the adaptive refinement of such group Bayesian networks to account for a specific target variable, like a disease. The combination of Bayesian networks, clustering, and refinement yields low-dimensional but disease-specific interaction networks. These networks provide easily interpretable, yet accurate models of biomarker interdependencies. We test our method extensively on simulated data, as well as on data from the Study of Health in Pomerania (SHIP-TREND), and demonstrate its effectiveness using non-alcoholic fatty liver disease and hypertension as examples. We show that the group network models outperform available biomarker scores, while at the same time, they provide an easily interpretable interaction network. |
format | Online Article Text |
id | pubmed-7906470 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-79064702021-03-03 From heterogeneous healthcare data to disease-specific biomarker networks: A hierarchical Bayesian network approach Becker, Ann-Kristin Dörr, Marcus Felix, Stephan B. Frost, Fabian Grabe, Hans J. Lerch, Markus M. Nauck, Matthias Völker, Uwe Völzke, Henry Kaderali, Lars PLoS Comput Biol Research Article In this work, we introduce an entirely data-driven and automated approach to reveal disease-associated biomarker and risk factor networks from heterogeneous and high-dimensional healthcare data. Our workflow is based on Bayesian networks, which are a popular tool for analyzing the interplay of biomarkers. Usually, data require extensive manual preprocessing and dimension reduction to allow for effective learning of Bayesian networks. For heterogeneous data, this preprocessing is hard to automatize and typically requires domain-specific prior knowledge. We here combine Bayesian network learning with hierarchical variable clustering in order to detect groups of similar features and learn interactions between them entirely automated. We present an optimization algorithm for the adaptive refinement of such group Bayesian networks to account for a specific target variable, like a disease. The combination of Bayesian networks, clustering, and refinement yields low-dimensional but disease-specific interaction networks. These networks provide easily interpretable, yet accurate models of biomarker interdependencies. We test our method extensively on simulated data, as well as on data from the Study of Health in Pomerania (SHIP-TREND), and demonstrate its effectiveness using non-alcoholic fatty liver disease and hypertension as examples. We show that the group network models outperform available biomarker scores, while at the same time, they provide an easily interpretable interaction network. 
Public Library of Science 2021-02-12 /pmc/articles/PMC7906470/ /pubmed/33577591 http://dx.doi.org/10.1371/journal.pcbi.1008735 Text en © 2021 Becker et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Becker, Ann-Kristin Dörr, Marcus Felix, Stephan B. Frost, Fabian Grabe, Hans J. Lerch, Markus M. Nauck, Matthias Völker, Uwe Völzke, Henry Kaderali, Lars From heterogeneous healthcare data to disease-specific biomarker networks: A hierarchical Bayesian network approach |
title | From heterogeneous healthcare data to disease-specific biomarker networks: A hierarchical Bayesian network approach |
title_full | From heterogeneous healthcare data to disease-specific biomarker networks: A hierarchical Bayesian network approach |
title_fullStr | From heterogeneous healthcare data to disease-specific biomarker networks: A hierarchical Bayesian network approach |
title_full_unstemmed | From heterogeneous healthcare data to disease-specific biomarker networks: A hierarchical Bayesian network approach |
title_short | From heterogeneous healthcare data to disease-specific biomarker networks: A hierarchical Bayesian network approach |
title_sort | from heterogeneous healthcare data to disease-specific biomarker networks: a hierarchical bayesian network approach |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7906470/ https://www.ncbi.nlm.nih.gov/pubmed/33577591 http://dx.doi.org/10.1371/journal.pcbi.1008735 |