Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions
The adaptation of deep learning models within safety-critical systems cannot rely only on good prediction performance but needs to provide interpretable and robust explanations for their decisions. When modeling complex sequences, attention mechanisms are regarded as the established approach to support deep neural networks with intrinsic interpretability. …
Main Authors: | Barić, Domjan; Fumić, Petar; Horvatić, Davor; Lipic, Tomislav
---|---
Format: | Online Article Text
Language: | English
Published: | MDPI, 2021
Subjects: | Article
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7912396/ https://www.ncbi.nlm.nih.gov/pubmed/33503822 http://dx.doi.org/10.3390/e23020143
_version_ | 1783656567930880000 |
---|---
author | Barić, Domjan Fumić, Petar Horvatić, Davor Lipic, Tomislav |
author_facet | Barić, Domjan Fumić, Petar Horvatić, Davor Lipic, Tomislav |
author_sort | Barić, Domjan |
collection | PubMed |
description | The adaptation of deep learning models within safety-critical systems cannot rely only on good prediction performance but needs to provide interpretable and robust explanations for their decisions. When modeling complex sequences, attention mechanisms are regarded as the established approach to support deep neural networks with intrinsic interpretability. This paper focuses on the emerging trend of specifically designing diagnostic datasets for understanding the inner workings of attention-mechanism-based deep learning models for multivariate forecasting tasks. We design a novel benchmark of synthetic datasets with a transparent underlying generating process of multiple time series interactions of increasing complexity. The benchmark enables empirical evaluation of the performance of attention-based deep neural networks in three different aspects: (i) prediction performance score, (ii) interpretability correctness, (iii) sensitivity analysis. Our analysis shows that although most models have satisfying and stable prediction performance results, they often fail to give correct interpretability. The only model with both a satisfying performance score and correct interpretability is IMV-LSTM, which captures both autocorrelations and cross-correlations between multiple time series. Interestingly, when IMV-LSTM is evaluated on simulated data from statistical and mechanistic models, the correctness of its interpretability increases with more complex datasets. (An illustrative sketch of such a transparent generating process follows the record fields below.)
format | Online Article Text |
id | pubmed-7912396 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-7912396 2021-02-28 Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions Barić, Domjan Fumić, Petar Horvatić, Davor Lipic, Tomislav Entropy (Basel) Article The adaptation of deep learning models within safety-critical systems cannot rely only on good prediction performance but needs to provide interpretable and robust explanations for their decisions. When modeling complex sequences, attention mechanisms are regarded as the established approach to support deep neural networks with intrinsic interpretability. This paper focuses on the emerging trend of specifically designing diagnostic datasets for understanding the inner workings of attention-mechanism-based deep learning models for multivariate forecasting tasks. We design a novel benchmark of synthetic datasets with a transparent underlying generating process of multiple time series interactions of increasing complexity. The benchmark enables empirical evaluation of the performance of attention-based deep neural networks in three different aspects: (i) prediction performance score, (ii) interpretability correctness, (iii) sensitivity analysis. Our analysis shows that although most models have satisfying and stable prediction performance results, they often fail to give correct interpretability. The only model with both a satisfying performance score and correct interpretability is IMV-LSTM, which captures both autocorrelations and cross-correlations between multiple time series. Interestingly, when IMV-LSTM is evaluated on simulated data from statistical and mechanistic models, the correctness of its interpretability increases with more complex datasets. MDPI 2021-01-25 /pmc/articles/PMC7912396/ /pubmed/33503822 http://dx.doi.org/10.3390/e23020143 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Barić, Domjan Fumić, Petar Horvatić, Davor Lipic, Tomislav Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions |
title | Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions |
title_full | Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions |
title_fullStr | Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions |
title_full_unstemmed | Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions |
title_short | Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions |
title_sort | benchmarking attention-based interpretability of deep learning in multivariate time series predictions |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7912396/ https://www.ncbi.nlm.nih.gov/pubmed/33503822 http://dx.doi.org/10.3390/e23020143 |
work_keys_str_mv | AT baricdomjan benchmarkingattentionbasedinterpretabilityofdeeplearninginmultivariatetimeseriespredictions AT fumicpetar benchmarkingattentionbasedinterpretabilityofdeeplearninginmultivariatetimeseriespredictions AT horvaticdavor benchmarkingattentionbasedinterpretabilityofdeeplearninginmultivariatetimeseriespredictions AT lipictomislav benchmarkingattentionbasedinterpretabilityofdeeplearninginmultivariatetimeseriespredictions |
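The abstract's central idea, a diagnostic dataset whose generating process is fully transparent so that the correct attention pattern is known in advance, can be illustrated with a minimal sketch. This is not the authors' benchmark code: the NumPy setup, the lag and coefficient choices, and the `interpretability_correctness` proxy below are illustrative assumptions, not definitions taken from the paper.

```python
# Minimal sketch (not the paper's code) of one diagnostic dataset with a
# transparent generating process: only x0 drives the target, at a known lag,
# so we know in advance where a model's attention *should* go.
import numpy as np

rng = np.random.default_rng(0)          # fixed seed for reproducibility
T, n_vars, lag = 1000, 4, 3             # illustrative sizes, not the paper's

# Columns 0..2 are candidate driver series; column 3 is the forecast target.
X = rng.normal(size=(T, n_vars))
X[lag:, -1] = 0.8 * X[:-lag, 0] + 0.1 * rng.normal(size=T - lag)

def interpretability_correctness(attn, true_idx=0):
    """Share of a model's variable-level attention mass placed on the truly
    relevant input -- one simple proxy for 'interpretability correctness'
    (a hypothetical metric, not the paper's exact definition)."""
    attn = np.asarray(attn, dtype=float)
    return attn[true_idx] / attn.sum()

# A model attending [0.7, 0.1, 0.1, 0.1] over (x0, x1, x2, target history)
# puts 70% of its mass on the true driver x0.
print(interpretability_correctness([0.7, 0.1, 0.1, 0.1]))  # 0.7
```

Under a setup like this, an attention-based forecaster with correct interpretability would concentrate its variable-level attention on x0 at the known lag; a model can score well on prediction while spreading attention elsewhere, which is exactly the failure mode the benchmark's interpretability-correctness aspect is designed to expose.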