Autoencoder and Partially Impossible Reconstruction Losses

Bibliographic Details
Main Authors: Dias Da Cruz, Steve; Taetz, Bertram; Stifter, Thomas; Stricker, Didier
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9268944/
https://www.ncbi.nlm.nih.gov/pubmed/35808357
http://dx.doi.org/10.3390/s22134862
author Dias Da Cruz, Steve
Taetz, Bertram
Stifter, Thomas
Stricker, Didier
author_sort Dias Da Cruz, Steve
collection PubMed
description The generally unsupervised nature of autoencoder models implies that the main training metric is formulated as the error between input images and their corresponding reconstructions. Different reconstruction loss variations and latent space regularizations have been shown to improve model performance, depending on the task to be solved, and to induce new desirable properties such as disentanglement. Nevertheless, measuring success in, or enforcing properties through, the input pixel space is a challenging endeavour. In this work, we want to make use of the available data more efficiently and provide design choices to be considered in the recording or generation of future datasets to implicitly induce desirable properties during training. To this end, we propose a new sampling technique which matches semantically important parts of the image while randomizing the other parts, leading to salient feature extraction and the neglect of unimportant details. The proposed method can be combined with any existing reconstruction loss, and the performance gain is superior to that of the triplet loss. We analyse the resulting properties on various datasets and show improvements on several computer vision tasks: illumination and unwanted features can be normalized or smoothed out and shadows are removed, such that classification and other tasks work more reliably; better invariance with respect to unwanted features is induced; generalization from synthetic to real images is improved, such that more of the semantics are preserved; uncertainty estimation is superior to Monte Carlo Dropout and an ensemble of models, particularly for datasets of higher visual complexity. Finally, classification accuracy by means of simple linear classifiers in the latent space is improved compared to the triplet loss. For each task, the improvements are highlighted on several datasets commonly used by the research community, as well as in automotive applications.
format Online
Article
Text
id pubmed-9268944
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9268944 2022-07-09 Autoencoder and Partially Impossible Reconstruction Losses Dias Da Cruz, Steve; Taetz, Bertram; Stifter, Thomas; Stricker, Didier. Sensors (Basel). Article. MDPI 2022-06-27 /pmc/articles/PMC9268944/ /pubmed/35808357 http://dx.doi.org/10.3390/s22134862 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Autoencoder and Partially Impossible Reconstruction Losses
title_sort autoencoder and partially impossible reconstruction losses
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9268944/
https://www.ncbi.nlm.nih.gov/pubmed/35808357
http://dx.doi.org/10.3390/s22134862
work_keys_str_mv AT diasdacruzsteve autoencoderandpartiallyimpossiblereconstructionlosses
AT taetzbertram autoencoderandpartiallyimpossiblereconstructionlosses
AT stifterthomas autoencoderandpartiallyimpossiblereconstructionlosses
AT strickerdidier autoencoderandpartiallyimpossiblereconstructionlosses
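
The description above outlines the idea behind the partially impossible reconstruction loss: the reconstruction target shares the semantically important content of the input but differs in the randomized, unimportant parts, so the network can only ever reconstruct the shared content. Below is a minimal, illustrative PyTorch sketch of one way such a sampling scheme can be set up, assuming a dataset in which each scene is recorded under several nuisance variations (e.g., different illumination). The class names, architecture, and hyperparameters are placeholder assumptions for illustration, not the authors' implementation.

import random
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader


class SceneVariationDataset(Dataset):
    """Pairs two random variations of the same scene: one as input, one as target."""

    def __init__(self, scenes):
        # scenes: list of tensors, each shaped (n_variations, 3, H, W); all
        # variations of a scene share its semantic content but differ in
        # nuisance factors such as illumination or shadows.
        self.scenes = scenes

    def __len__(self):
        return len(self.scenes)

    def __getitem__(self, idx):
        variations = self.scenes[idx]
        i, j = random.sample(range(variations.shape[0]), 2)
        return variations[i], variations[j]


class SmallAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


def train_epoch(model, loader, optimizer):
    criterion = nn.MSELoss()  # any existing reconstruction loss could be plugged in here
    for x_input, x_target in loader:
        # The target is a *different* variation of the same scene, so exact
        # reconstruction is partially impossible; only the shared, semantically
        # important content can be recovered, while nuisance details are smoothed out.
        loss = criterion(model(x_input), x_target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()


if __name__ == "__main__":
    # Toy stand-in data: 8 scenes, each with 5 random 32x32 "variations".
    scenes = [torch.rand(5, 3, 32, 32) for _ in range(8)]
    loader = DataLoader(SceneVariationDataset(scenes), batch_size=4, shuffle=True)
    model = SmallAutoencoder()
    train_epoch(model, loader, torch.optim.Adam(model.parameters(), lr=1e-3))

The only data requirement in this sketch is that several variations of each scene are available, which corresponds to the kind of design choice the abstract suggests for the recording or generation of future datasets.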