On the Complementary Role of Data Assimilation and Machine Learning: An Example Derived from Air Quality Analysis
We present a new formulation of the error covariances that derives from ensembles of model simulations, which captures terrain-dependent error correlations, without the prohibitive cost of ensemble Kalman filtering. Error variances are obtained from innovation variances empirically related to concen...
Main Authors: | Ménard, Richard, Cossette, Jean-François, Deshaies-Jacques, Martin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7304728/ http://dx.doi.org/10.1007/978-3-030-50433-5_17 |
_version_ | 1783548314737704960 |
---|---|
author | Ménard, Richard Cossette, Jean-François Deshaies-Jacques, Martin |
author_facet | Ménard, Richard Cossette, Jean-François Deshaies-Jacques, Martin |
author_sort | Ménard, Richard |
collection | PubMed |
description | We present a new formulation of the error covariances that derives from ensembles of model simulations, which captures terrain-dependent error correlations, without the prohibitive cost of ensemble Kalman filtering. Error variances are obtained from innovation variances empirically related to concentrations using a large data set. We use a k-fold cross-validation approach to estimate the remaining parameters. We note that by minimizing the cross-validation cost function, we obtain the optimal parameters for an optimal Kalman gain. Combined with an innovation variance consistent with the sum of the observation and background error variances in observation space, this yields a scheme that estimates the true error statistics, thus minimizing the true analysis error. Overall, this new error statistics formulation and estimation outperforms the older optimum interpolation scheme that uses isotropic covariances with optimized covariance parameters. Yet the analysis scheme is computationally comparable to optimum interpolation and can be used in real-time operational applications. These new error statistics come as data-driven models, where we use validation techniques that are common in machine learning. We argue that the error statistics could benefit from a machine learning approach, while the air quality model and analysis scheme derive from physics and statistics. |
format | Online Article Text |
id | pubmed-7304728 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-7304728 2020-06-22 On the Complementary Role of Data Assimilation and Machine Learning: An Example Derived from Air Quality Analysis Ménard, Richard Cossette, Jean-François Deshaies-Jacques, Martin Computational Science – ICCS 2020 Article We present a new formulation of the error covariances that derives from ensembles of model simulations, which captures terrain-dependent error correlations, without the prohibitive cost of ensemble Kalman filtering. Error variances are obtained from innovation variances empirically related to concentrations using a large data set. We use a k-fold cross-validation approach to estimate the remaining parameters. We note that by minimizing the cross-validation cost function, we obtain the optimal parameters for an optimal Kalman gain. Combined with an innovation variance consistent with the sum of the observation and background error variances in observation space, this yields a scheme that estimates the true error statistics, thus minimizing the true analysis error. Overall, this new error statistics formulation and estimation outperforms the older optimum interpolation scheme that uses isotropic covariances with optimized covariance parameters. Yet the analysis scheme is computationally comparable to optimum interpolation and can be used in real-time operational applications. These new error statistics come as data-driven models, where we use validation techniques that are common in machine learning. We argue that the error statistics could benefit from a machine learning approach, while the air quality model and analysis scheme derive from physics and statistics. 2020-05-25 /pmc/articles/PMC7304728/ http://dx.doi.org/10.1007/978-3-030-50433-5_17 Text en © Her Majesty the Queen in Right of Canada 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source.
These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Ménard, Richard Cossette, Jean-François Deshaies-Jacques, Martin On the Complementary Role of Data Assimilation and Machine Learning: An Example Derived from Air Quality Analysis |
title | On the Complementary Role of Data Assimilation and Machine Learning: An Example Derived from Air Quality Analysis |
title_full | On the Complementary Role of Data Assimilation and Machine Learning: An Example Derived from Air Quality Analysis |
title_fullStr | On the Complementary Role of Data Assimilation and Machine Learning: An Example Derived from Air Quality Analysis |
title_full_unstemmed | On the Complementary Role of Data Assimilation and Machine Learning: An Example Derived from Air Quality Analysis |
title_short | On the Complementary Role of Data Assimilation and Machine Learning: An Example Derived from Air Quality Analysis |
title_sort | on the complementary role of data assimilation and machine learning: an example derived from air quality analysis |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7304728/ http://dx.doi.org/10.1007/978-3-030-50433-5_17 |
work_keys_str_mv | AT menardrichard onthecomplementaryroleofdataassimilationandmachinelearninganexamplederivedfromairqualityanalysis AT cossettejeanfrancois onthecomplementaryroleofdataassimilationandmachinelearninganexamplederivedfromairqualityanalysis AT deshaiesjacquesmartin onthecomplementaryroleofdataassimilationandmachinelearninganexamplederivedfromairqualityanalysis |