The probabilistic backbone of data-driven complex networks: an example in climate

Complex systems often exhibit long-range correlations so that typical observables show statistical dependence across long distances. These teleconnections have a tremendous impact on the dynamics as they provide channels for information transport across the system and are particularly relevant in fo...

Bibliographic Details
Main authors: Graafland, Catharina E., Gutiérrez, José M., López, Juan M., Pazó, Diego, Rodríguez, Miguel A.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7359351/
https://www.ncbi.nlm.nih.gov/pubmed/32661248
http://dx.doi.org/10.1038/s41598-020-67970-y
_version_ 1783559031273553920
author Graafland, Catharina E.
Gutiérrez, José M.
López, Juan M.
Pazó, Diego
Rodríguez, Miguel A.
collection PubMed
description Complex systems often exhibit long-range correlations so that typical observables show statistical dependence across long distances. These teleconnections have a tremendous impact on the dynamics as they provide channels for information transport across the system and are particularly relevant in forecasting, control, and data-driven modeling of complex systems. These statistical interrelations among the very many degrees of freedom are usually represented by the so-called correlation network, constructed by establishing links between variables (nodes) with pairwise correlations above a given threshold. Here, with the climate system as an example, we revisit correlation networks from a probabilistic perspective and show that they unavoidably include much redundant information, resulting in overfitted probabilistic (Gaussian) models. As an alternative, we propose here the use of more sophisticated probabilistic Bayesian networks, developed by the machine learning community, as a data-driven modeling and prediction tool. Bayesian networks are built from data including only the (pairwise and conditional) dependencies among the variables needed to explain the data (i.e., maximizing the likelihood of the underlying probabilistic Gaussian model). This results in much simpler, sparser, non-redundant networks still encoding the complex structure of the dataset as revealed by standard complex measures. Moreover, the networks are able to generalize to new data and constitute a truly probabilistic backbone of the system. When applied to climate data, it is shown that Bayesian networks faithfully reveal the various long-range teleconnections relevant in the dataset, in particular those emerging in El Niño periods.
format Online
Article
Text
id pubmed-7359351
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7359351 2020-07-16 The probabilistic backbone of data-driven complex networks: an example in climate Graafland, Catharina E. Gutiérrez, José M. López, Juan M. Pazó, Diego Rodríguez, Miguel A. Sci Rep Article
Nature Publishing Group UK 2020-07-13 /pmc/articles/PMC7359351/ /pubmed/32661248 http://dx.doi.org/10.1038/s41598-020-67970-y Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title The probabilistic backbone of data-driven complex networks: an example in climate
topic Article