On the Complementary Role of Data Assimilation and Machine Learning: An Example Derived from Air Quality Analysis

We present a new formulation of the error covariances that derives from ensembles of model simulations, which captures terrain-dependent error correlations, without the prohibitive cost of ensemble Kalman filtering. Error variances are obtained from innovation variances empirically related to concen...

Bibliographic Details
Main authors: Ménard, Richard, Cossette, Jean-François, Deshaies-Jacques, Martin
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7304728/
http://dx.doi.org/10.1007/978-3-030-50433-5_17
_version_ 1783548314737704960
author Ménard, Richard
Cossette, Jean-François
Deshaies-Jacques, Martin
collection PubMed
description We present a new formulation of the error covariances that derives from ensembles of model simulations, which captures terrain-dependent error correlations, without the prohibitive cost of ensemble Kalman filtering. Error variances are obtained from innovation variances empirically related to concentrations using a large data set. We use a k-fold cross-validation approach to estimate the remaining parameters. We note that by minimizing the cross-validation cost function, we obtain the optimal parameters for an optimal Kalman gain. Combined with an innovation variance that is consistent with the sum of observation and background error variances in observation space, this yields a scheme that estimates the true error statistics, thus minimizing the true analysis error. Overall, this new error statistics formulation and estimation outperforms the older optimum interpolation scheme using isotropic covariances with optimized covariance parameters. Yet, the analysis scheme is computationally comparable to optimum interpolation and can be used in real-time operational applications. These new error statistics come as data-driven models, where we use validation techniques that are common in machine learning. We argue that the error statistics could benefit from a machine learning approach, while the air quality model and analysis scheme derive from physics and statistics.
format Online
Article
Text
id pubmed-7304728
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-7304728 2020-06-22 On the Complementary Role of Data Assimilation and Machine Learning: An Example Derived from Air Quality Analysis Ménard, Richard Cossette, Jean-François Deshaies-Jacques, Martin Computational Science – ICCS 2020 Article We present a new formulation of the error covariances that derives from ensembles of model simulations, which captures terrain-dependent error correlations, without the prohibitive cost of ensemble Kalman filtering. Error variances are obtained from innovation variances empirically related to concentrations using a large data set. We use a k-fold cross-validation approach to estimate the remaining parameters. We note that by minimizing the cross-validation cost function, we obtain the optimal parameters for an optimal Kalman gain. Combined with an innovation variance that is consistent with the sum of observation and background error variances in observation space, this yields a scheme that estimates the true error statistics, thus minimizing the true analysis error. Overall, this new error statistics formulation and estimation outperforms the older optimum interpolation scheme using isotropic covariances with optimized covariance parameters. Yet, the analysis scheme is computationally comparable to optimum interpolation and can be used in real-time operational applications. These new error statistics come as data-driven models, where we use validation techniques that are common in machine learning. We argue that the error statistics could benefit from a machine learning approach, while the air quality model and analysis scheme derive from physics and statistics. 2020-05-25 /pmc/articles/PMC7304728/ http://dx.doi.org/10.1007/978-3-030-50433-5_17 Text en © Her Majesty the Queen in Right of Canada 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source.
These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title On the Complementary Role of Data Assimilation and Machine Learning: An Example Derived from Air Quality Analysis
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7304728/
http://dx.doi.org/10.1007/978-3-030-50433-5_17