
CX-ToM: Counterfactual explanations with theory-of-mind for enhancing human trust in image recognition models

We propose CX-ToM, short for counterfactual explanations with theory-of-mind, a new explainable AI (XAI) framework for explaining decisions made by a deep convolutional neural network (CNN). In contrast to the current methods in XAI that generate explanations as a single shot response, we pose expla...


Bibliographic Details
Main authors:	Akula, Arjun R.; Wang, Keze; Liu, Changsong; Saba-Sadiya, Sari; Lu, Hongjing; Todorovic, Sinisa; Chai, Joyce; Zhu, Song-Chun
Format:	Online Article Text
Language:	English
Published:	Elsevier 2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8753121/ https://www.ncbi.nlm.nih.gov/pubmed/35036861 http://dx.doi.org/10.1016/j.isci.2021.103581

collection	PubMed
description	We propose CX-ToM, short for counterfactual explanations with theory-of-mind, a new explainable AI (XAI) framework for explaining decisions made by a deep convolutional neural network (CNN). In contrast to current XAI methods, which generate explanations as a single-shot response, we pose explanation as an iterative communication process, i.e., a dialogue between the machine and the human user. More concretely, our CX-ToM framework generates a sequence of explanations in a dialogue by mediating the differences between the minds of the machine and the human user. To do this, we use Theory of Mind (ToM), which helps us explicitly model the human's intention, the machine's mind as inferred by the human, and the human's mind as inferred by the machine. Moreover, most state-of-the-art XAI frameworks provide attention-based (or heat-map) explanations. In our work, we show that these attention-based explanations are not sufficient for increasing human trust in the underlying CNN model. In CX-ToM, we instead use counterfactual explanations called fault-lines, which we define as follows: given an input image I for which a CNN classification model M predicts class c_pred, a fault-line identifies the minimal semantic-level features (e.g., stripes on a zebra), referred to as explainable concepts, that need to be added to or deleted from I to alter the classification category of I by M to another specified class c_alt. Extensive experiments verify our hypotheses, demonstrating that CX-ToM significantly outperforms the state-of-the-art XAI models.
id	pubmed-8753121
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-8753121 2022-01-14 iScience Article Elsevier 2021-12-11 /pmc/articles/PMC8753121/ /pubmed/35036861 http://dx.doi.org/10.1016/j.isci.2021.103581 Text en © 2021 The Author(s). This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).