Learning Bayesian Networks from Correlated Data
Bayesian networks are probabilistic models that represent complex distributions in a modular way and have become very popular in many fields. There are many methods to build Bayesian networks from a random sample of independent and identically distributed observations. However, many observational st...
Main authors: | Bae, Harold; Monti, Stefano; Montano, Monty; Steinberg, Martin H.; Perls, Thomas T.; Sebastiani, Paola |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4857179/ https://www.ncbi.nlm.nih.gov/pubmed/27146517 http://dx.doi.org/10.1038/srep25156 |
_version_ | 1782430608998268928 |
---|---|
author | Bae, Harold Monti, Stefano Montano, Monty Steinberg, Martin H. Perls, Thomas T. Sebastiani, Paola |
author_sort | Bae, Harold |
collection | PubMed |
description | Bayesian networks are probabilistic models that represent complex distributions in a modular way and have become very popular in many fields. There are many methods to build Bayesian networks from a random sample of independent and identically distributed observations. However, many observational studies are designed using some form of clustered sampling that introduces correlations between observations within the same cluster and ignoring this correlation typically inflates the rate of false positive associations. We describe a novel parameterization of Bayesian networks that uses random effects to model the correlation within sample units and can be used for structure and parameter learning from correlated data without inflating the Type I error rate. We compare different learning metrics using simulations and illustrate the method in two real examples: an analysis of genetic and non-genetic factors associated with human longevity from a family-based study, and an example of risk factors for complications of sickle cell anemia from a longitudinal study with repeated measures. |
format | Online Article Text |
id | pubmed-4857179 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-48571792016-05-19 Learning Bayesian Networks from Correlated Data Bae, Harold Monti, Stefano Montano, Monty Steinberg, Martin H. Perls, Thomas T. Sebastiani, Paola Sci Rep Article Bayesian networks are probabilistic models that represent complex distributions in a modular way and have become very popular in many fields. There are many methods to build Bayesian networks from a random sample of independent and identically distributed observations. However, many observational studies are designed using some form of clustered sampling that introduces correlations between observations within the same cluster and ignoring this correlation typically inflates the rate of false positive associations. We describe a novel parameterization of Bayesian networks that uses random effects to model the correlation within sample units and can be used for structure and parameter learning from correlated data without inflating the Type I error rate. We compare different learning metrics using simulations and illustrate the method in two real examples: an analysis of genetic and non-genetic factors associated with human longevity from a family-based study, and an example of risk factors for complications of sickle cell anemia from a longitudinal study with repeated measures. Nature Publishing Group 2016-05-05 /pmc/articles/PMC4857179/ /pubmed/27146517 http://dx.doi.org/10.1038/srep25156 Text en Copyright © 2016, Macmillan Publishers Limited http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ |
spellingShingle | Article Bae, Harold Monti, Stefano Montano, Monty Steinberg, Martin H. Perls, Thomas T. Sebastiani, Paola Learning Bayesian Networks from Correlated Data |
title | Learning Bayesian Networks from Correlated Data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4857179/ https://www.ncbi.nlm.nih.gov/pubmed/27146517 http://dx.doi.org/10.1038/srep25156 |