The probabilistic backbone of data-driven complex networks: an example in climate
Complex systems often exhibit long-range correlations so that typical observables show statistical dependence across long distances. These teleconnections have a tremendous impact on the dynamics as they provide channels for information transport across the system and are particularly relevant in fo...
Main authors: | Graafland, Catharina E.; Gutiérrez, José M.; López, Juan M.; Pazó, Diego; Rodríguez, Miguel A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2020 |
Subjects: | Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7359351/ https://www.ncbi.nlm.nih.gov/pubmed/32661248 http://dx.doi.org/10.1038/s41598-020-67970-y |
_version_ | 1783559031273553920 |
---|---|
author | Graafland, Catharina E. Gutiérrez, José M. López, Juan M. Pazó, Diego Rodríguez, Miguel A. |
author_facet | Graafland, Catharina E. Gutiérrez, José M. López, Juan M. Pazó, Diego Rodríguez, Miguel A. |
author_sort | Graafland, Catharina E. |
collection | PubMed |
description | Complex systems often exhibit long-range correlations so that typical observables show statistical dependence across long distances. These teleconnections have a tremendous impact on the dynamics as they provide channels for information transport across the system and are particularly relevant in forecasting, control, and data-driven modeling of complex systems. These statistical interrelations among the very many degrees of freedom are usually represented by the so-called correlation network, constructed by establishing links between variables (nodes) with pairwise correlations above a given threshold. Here, with the climate system as an example, we revisit correlation networks from a probabilistic perspective and show that they unavoidably include much redundant information, resulting in overfitted probabilistic (Gaussian) models. As an alternative, we propose here the use of more sophisticated probabilistic Bayesian networks, developed by the machine learning community, as a data-driven modeling and prediction tool. Bayesian networks are built from data including only the (pairwise and conditional) dependencies among the variables needed to explain the data (i.e., maximizing the likelihood of the underlying probabilistic Gaussian model). This results in much simpler, sparser, non-redundant networks still encoding the complex structure of the dataset as revealed by standard complex measures. Moreover, the networks are capable of generalizing to new data and constitute a truly probabilistic backbone of the system. When applied to climate data, it is shown that Bayesian networks faithfully reveal the various long-range teleconnections relevant in the dataset, in particular those emerging in El Niño periods. |
format | Online Article Text |
id | pubmed-7359351 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-73593512020-07-16 The probabilistic backbone of data-driven complex networks: an example in climate Graafland, Catharina E. Gutiérrez, José M. López, Juan M. Pazó, Diego Rodríguez, Miguel A. Sci Rep Article Complex systems often exhibit long-range correlations so that typical observables show statistical dependence across long distances. These teleconnections have a tremendous impact on the dynamics as they provide channels for information transport across the system and are particularly relevant in forecasting, control, and data-driven modeling of complex systems. These statistical interrelations among the very many degrees of freedom are usually represented by the so-called correlation network, constructed by establishing links between variables (nodes) with pairwise correlations above a given threshold. Here, with the climate system as an example, we revisit correlation networks from a probabilistic perspective and show that they unavoidably include much redundant information, resulting in overfitted probabilistic (Gaussian) models. As an alternative, we propose here the use of more sophisticated probabilistic Bayesian networks, developed by the machine learning community, as a data-driven modeling and prediction tool. Bayesian networks are built from data including only the (pairwise and conditional) dependencies among the variables needed to explain the data (i.e., maximizing the likelihood of the underlying probabilistic Gaussian model). This results in much simpler, sparser, non-redundant, networks still encoding the complex structure of the dataset as revealed by standard complex measures. Moreover, the networks are capable to generalize to new data and constitute a truly probabilistic backbone of the system. When applied to climate data, it is shown that Bayesian networks faithfully reveal the various long-range teleconnections relevant in the dataset, in particular those emerging in El Niño periods. 
Nature Publishing Group UK 2020-07-13 /pmc/articles/PMC7359351/ /pubmed/32661248 http://dx.doi.org/10.1038/s41598-020-67970-y Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Graafland, Catharina E. Gutiérrez, José M. López, Juan M. Pazó, Diego Rodríguez, Miguel A. The probabilistic backbone of data-driven complex networks: an example in climate |
title | The probabilistic backbone of data-driven complex networks: an example in climate |
title_full | The probabilistic backbone of data-driven complex networks: an example in climate |
title_fullStr | The probabilistic backbone of data-driven complex networks: an example in climate |
title_full_unstemmed | The probabilistic backbone of data-driven complex networks: an example in climate |
title_short | The probabilistic backbone of data-driven complex networks: an example in climate |
title_sort | probabilistic backbone of data-driven complex networks: an example in climate |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7359351/ https://www.ncbi.nlm.nih.gov/pubmed/32661248 http://dx.doi.org/10.1038/s41598-020-67970-y |