Unmasking Clever Hans predictors and assessing what machines really learn

Bibliographic Details
Main authors: Lapuschkin, Sebastian, Wäldchen, Stephan, Binder, Alexander, Montavon, Grégoire, Samek, Wojciech, Müller, Klaus-Robert
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6411769/
https://www.ncbi.nlm.nih.gov/pubmed/30858366
http://dx.doi.org/10.1038/s41467-019-08987-4
_version_ 1783402450061885440
author Lapuschkin, Sebastian
Wäldchen, Stephan
Binder, Alexander
Montavon, Grégoire
Samek, Wojciech
Müller, Klaus-Robert
author_sort Lapuschkin, Sebastian
collection PubMed
description Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly intelligent behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighted, to well-informed and strategic. We observe that standard performance evaluation metrics can be oblivious to distinguishing these diverse problem solving behaviors. Furthermore, we propose our semi-automated Spectral Relevance Analysis that provides a practically effective way of characterizing and validating the behavior of nonlinear learning machines. This helps to assess whether a learned model indeed delivers reliably for the problem that it was conceived for. Furthermore, our work intends to add a voice of caution to the ongoing excitement about machine intelligence and pledges to evaluate and judge some of these recent successes in a more nuanced manner.
format Online
Article
Text
id pubmed-6411769
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-6411769 2019-03-13 Unmasking Clever Hans predictors and assessing what machines really learn Nat Commun Article Nature Publishing Group UK 2019-03-11 /pmc/articles/PMC6411769/ /pubmed/30858366 http://dx.doi.org/10.1038/s41467-019-08987-4 Text en © The Author(s) 2019
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Unmasking Clever Hans predictors and assessing what machines really learn
topic Article