Critical Analysis of Deconfounded Pretraining to Improve Visio-Linguistic Models
Main authors: Cornille, Nathan; Laenen, Katrien; Moens, Marie-Francine
Format: Online Article Text
Language: English
Published: Frontiers Media S.A., 2022
Subjects: Artificial Intelligence
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8993511/ https://www.ncbi.nlm.nih.gov/pubmed/35402901 http://dx.doi.org/10.3389/frai.2022.736791
_version_ | 1784683915913461760 |
author | Cornille, Nathan Laenen, Katrien Moens, Marie-Francine |
author_facet | Cornille, Nathan Laenen, Katrien Moens, Marie-Francine |
author_sort | Cornille, Nathan |
collection | PubMed |
description | An important problem with many current visio-linguistic models is that they often depend on spurious correlations. A typical example of a spurious correlation between two variables is one that is due to a third variable causing both (a “confounder”). Recent work has addressed this by adjusting for spurious correlations using a technique of deconfounding with automatically found confounders. We will refer to this technique as AutoDeconfounding. This article dives more deeply into AutoDeconfounding, and surfaces a number of issues of the original technique. First, we evaluate whether its implementation is actually equivalent to deconfounding. We provide an explicit explanation of the relation between AutoDeconfounding and the underlying causal model on which it implicitly operates, and show that additional assumptions are needed before the implementation of AutoDeconfounding can be equated to correct deconfounding. Inspired by this result, we perform ablation studies to verify to what extent the improvement on downstream visio-linguistic tasks reported by the works that implement AutoDeconfounding is due to AutoDeconfounding, and to what extent it is specifically due to the deconfounding aspect of AutoDeconfounding. We evaluate AutoDeconfounding in a way that isolates its effect, and no longer see the same improvement. We also show that tweaking AutoDeconfounding to be less related to deconfounding does not negatively affect performance on downstream visio-linguistic tasks. Furthermore, we create a human-labeled ground truth causality dataset for objects in a scene to empirically verify whether and how well confounders are found. We show that some models do indeed find more confounders than a random baseline, but also that finding more confounders is not correlated with performing better on downstream visio-linguistic tasks. 
Finally, we summarize the current limitations of AutoDeconfounding to solve the issue of spurious correlations and provide directions for the design of novel AutoDeconfounding methods that are aimed at overcoming these limitations. |
format | Online Article Text |
id | pubmed-8993511 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-89935112022-04-09 Critical Analysis of Deconfounded Pretraining to Improve Visio-Linguistic Models Cornille, Nathan Laenen, Katrien Moens, Marie-Francine Front Artif Intell Artificial Intelligence Frontiers Media S.A. 2022-03-17 /pmc/articles/PMC8993511/ /pubmed/35402901 http://dx.doi.org/10.3389/frai.2022.736791 Text en Copyright © 2022 Cornille, Laenen and Moens. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Artificial Intelligence Cornille, Nathan Laenen, Katrien Moens, Marie-Francine Critical Analysis of Deconfounded Pretraining to Improve Visio-Linguistic Models |
title | Critical Analysis of Deconfounded Pretraining to Improve Visio-Linguistic Models |
title_full | Critical Analysis of Deconfounded Pretraining to Improve Visio-Linguistic Models |
title_fullStr | Critical Analysis of Deconfounded Pretraining to Improve Visio-Linguistic Models |
title_full_unstemmed | Critical Analysis of Deconfounded Pretraining to Improve Visio-Linguistic Models |
title_short | Critical Analysis of Deconfounded Pretraining to Improve Visio-Linguistic Models |
title_sort | critical analysis of deconfounded pretraining to improve visio-linguistic models |
topic | Artificial Intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8993511/ https://www.ncbi.nlm.nih.gov/pubmed/35402901 http://dx.doi.org/10.3389/frai.2022.736791 |
work_keys_str_mv | AT cornillenathan criticalanalysisofdeconfoundedpretrainingtoimprovevisiolinguisticmodels AT laenenkatrien criticalanalysisofdeconfoundedpretrainingtoimprovevisiolinguisticmodels AT moensmariefrancine criticalanalysisofdeconfoundedpretrainingtoimprovevisiolinguisticmodels |
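The abstract describes deconfounding: removing a spurious correlation between two variables X and Y that arises because a third variable Z (a confounder) causes both. A minimal sketch of this idea, not taken from the article itself (the binary confounder and all variable names are illustrative assumptions), shows how adjusting for a known confounder reveals the absence of any direct X–Y effect:

```python
# Illustrative sketch: a confounder Z causes both X and Y, inducing a
# spurious X-Y correlation; adjusting for Z (stratifying on it) removes it.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
z = rng.binomial(1, 0.5, n)           # confounder: causes both X and Y
x = z + 0.1 * rng.standard_normal(n)  # X depends only on Z (plus noise)
y = z + 0.1 * rng.standard_normal(n)  # Y depends only on Z, not on X

# Marginal correlation: strong, but driven entirely by the shared cause Z.
marginal = np.corrcoef(x, y)[0, 1]

# Adjustment for the confounder: measure the X-Y association within each
# stratum of Z and average; with no direct effect, it vanishes.
adjusted = np.mean([np.corrcoef(x[z == v], y[z == v])[0, 1] for v in (0, 1)])
```

Here the marginal correlation is close to 1 purely through Z, while the stratum-averaged correlation is close to 0. The premise of AutoDeconfounding, as critiqued in the article, is that such a Z can be identified automatically from data rather than being given in advance.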