Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions
The adaptation of deep learning models within safety-critical systems cannot rely only on good prediction performance but needs to provide interpretable and robust explanations for their decisions. When modeling complex sequences, attention mechanisms are regarded as the established approach to support deep neural networks with intrinsic interpretability. …
Main Authors: | Barić, Domjan; Fumić, Petar; Horvatić, Davor; Lipic, Tomislav
---|---
Format: | Online Article Text
Language: | English
Published: | MDPI, 2021
Subjects: | Article
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7912396/ https://www.ncbi.nlm.nih.gov/pubmed/33503822 http://dx.doi.org/10.3390/e23020143
_version_ | 1783656567930880000 |
---|---
author | Barić, Domjan Fumić, Petar Horvatić, Davor Lipic, Tomislav |
author_facet | Barić, Domjan Fumić, Petar Horvatić, Davor Lipic, Tomislav |
author_sort | Barić, Domjan |
collection | PubMed |
description | The adaptation of deep learning models within safety-critical systems cannot rely only on good prediction performance but needs to provide interpretable and robust explanations for their decisions. When modeling complex sequences, attention mechanisms are regarded as the established approach to support deep neural networks with intrinsic interpretability. This paper focuses on the emerging trend of specifically designing diagnostic datasets for understanding the inner workings of attention-mechanism-based deep learning models for multivariate forecasting tasks. We design a novel benchmark of synthetic datasets with a transparent underlying generating process of multiple time series interactions of increasing complexity. The benchmark enables empirical evaluation of the performance of attention-based deep neural networks in three different aspects: (i) prediction performance score, (ii) interpretability correctness, (iii) sensitivity analysis. Our analysis shows that although most models have satisfying and stable prediction performance results, they often fail to give correct interpretability. The only model with both a satisfying performance score and correct interpretability is IMV-LSTM, which captures both autocorrelations and cross-correlations between multiple time series. Interestingly, when IMV-LSTM is evaluated on simulated data from statistical and mechanistic models, the correctness of its interpretability increases with more complex datasets. (An illustrative sketch of such a transparent generating process follows the record fields below.)
format | Online Article Text |
id | pubmed-7912396 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-7912396 2021-02-28 Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions Barić, Domjan Fumić, Petar Horvatić, Davor Lipic, Tomislav Entropy (Basel) Article The adaptation of deep learning models within safety-critical systems cannot rely only on good prediction performance but needs to provide interpretable and robust explanations for their decisions. When modeling complex sequences, attention mechanisms are regarded as the established approach to support deep neural networks with intrinsic interpretability. This paper focuses on the emerging trend of specifically designing diagnostic datasets for understanding the inner workings of attention-mechanism-based deep learning models for multivariate forecasting tasks. We design a novel benchmark of synthetic datasets with a transparent underlying generating process of multiple time series interactions of increasing complexity. The benchmark enables empirical evaluation of the performance of attention-based deep neural networks in three different aspects: (i) prediction performance score, (ii) interpretability correctness, (iii) sensitivity analysis. Our analysis shows that although most models have satisfying and stable prediction performance results, they often fail to give correct interpretability. The only model with both a satisfying performance score and correct interpretability is IMV-LSTM, which captures both autocorrelations and cross-correlations between multiple time series. Interestingly, when IMV-LSTM is evaluated on simulated data from statistical and mechanistic models, the correctness of its interpretability increases with more complex datasets. MDPI 2021-01-25 /pmc/articles/PMC7912396/ /pubmed/33503822 http://dx.doi.org/10.3390/e23020143 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Barić, Domjan Fumić, Petar Horvatić, Davor Lipic, Tomislav Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions |
title | Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions |
title_full | Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions |
title_fullStr | Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions |
title_full_unstemmed | Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions |
title_short | Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions |
title_sort | benchmarking attention-based interpretability of deep learning in multivariate time series predictions |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7912396/ https://www.ncbi.nlm.nih.gov/pubmed/33503822 http://dx.doi.org/10.3390/e23020143 |
work_keys_str_mv | AT baricdomjan benchmarkingattentionbasedinterpretabilityofdeeplearninginmultivariatetimeseriespredictions AT fumicpetar benchmarkingattentionbasedinterpretabilityofdeeplearninginmultivariatetimeseriespredictions AT horvaticdavor benchmarkingattentionbasedinterpretabilityofdeeplearninginmultivariatetimeseriespredictions AT lipictomislav benchmarkingattentionbasedinterpretabilityofdeeplearninginmultivariatetimeseriespredictions |
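The abstract's central idea, a diagnostic dataset whose generating process is fully transparent so that the correct attention pattern is known in advance, can be illustrated with a minimal sketch. This is not the authors' benchmark code: the NumPy setup, the lag and coefficient choices, and the `interpretability_correctness` proxy below are illustrative assumptions, not definitions taken from the paper.

```python
# Minimal sketch (not the paper's code) of one diagnostic dataset with a
# transparent generating process: only x0 drives the target, at a known lag,
# so we know in advance where a model's attention *should* go.
import numpy as np

rng = np.random.default_rng(0)          # fixed seed for reproducibility
T, n_vars, lag = 1000, 4, 3             # illustrative sizes, not the paper's

# Columns 0..2 are candidate driver series; column 3 is the forecast target.
X = rng.normal(size=(T, n_vars))
X[lag:, -1] = 0.8 * X[:-lag, 0] + 0.1 * rng.normal(size=T - lag)

def interpretability_correctness(attn, true_idx=0):
    """Share of a model's variable-level attention mass placed on the truly
    relevant input -- one simple proxy for 'interpretability correctness'
    (a hypothetical metric, not the paper's exact definition)."""
    attn = np.asarray(attn, dtype=float)
    return attn[true_idx] / attn.sum()

# A model attending [0.7, 0.1, 0.1, 0.1] over (x0, x1, x2, target history)
# puts 70% of its mass on the true driver x0.
print(interpretability_correctness([0.7, 0.1, 0.1, 0.1]))  # 0.7
```

Under a setup like this, an attention-based forecaster with correct interpretability would concentrate its variable-level attention on x0 at the known lag; a model can score well on prediction while spreading attention elsewhere, which is exactly the failure mode the benchmark's interpretability-correctness aspect is designed to expose.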