Cargando…

Exploration of Semantic Label Decomposition and Dataset Size in Semantic Indoor Scenes Synthesis via Optimized Residual Generative Adversarial Networks

In this paper, we revisit the paired image-to-image translation using the conditional generative adversarial network, the so-called “Pix2Pix”, and propose efficient optimization techniques for the architecture and the training method to maximize the architecture’s performance to boost the realism of...

Descripción completa

Detalles Bibliográficos
Autores principales:	Ibrahem, Hatem, Salem, Ahmed, Kang, Hyun-Soo
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	MDPI 2022
Materias:	Article
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9657195/ https://www.ncbi.nlm.nih.gov/pubmed/36366007 http://dx.doi.org/10.3390/s22218306

_version_	1784829631347556352
author	Ibrahem, Hatem Salem, Ahmed Kang, Hyun-Soo
author_facet	Ibrahem, Hatem Salem, Ahmed Kang, Hyun-Soo
author_sort	Ibrahem, Hatem
collection	PubMed
description	In this paper, we revisit the paired image-to-image translation using the conditional generative adversarial network, the so-called “Pix2Pix”, and propose efficient optimization techniques for the architecture and the training method to maximize the architecture’s performance to boost the realism of the generated images. We propose a generative adversarial network-based technique to create new artificial indoor scenes using a user-defined semantic segmentation map as an input to define the location, shape, and category of each object in the scene, exactly similar to Pix2Pix. We train different residual connections-based architectures of the generator and discriminator on the NYU depth-v2 dataset and a selected indoor subset from the ADE20K dataset, showing that the proposed models have fewer parameters, less computational complexity, and can generate better quality images than the state of the art methods following the same technique to generate realistic indoor images. We also prove that using extra specific labels and more training samples increases the quality of the generated images; however, the proposed residual connections-based models can learn better from small datasets (i.e., NYU depth-v2) and can improve the realism of the generated images in training on bigger datasets (i.e., ADE20K indoor subset) in comparison to Pix2Pix. The proposed method achieves an LPIPS value of 0.505 and an FID value of 81.067, generating better quality images than that produced by Pix2Pix and other recent paired Image-to-image translation methods and outperforming them in terms of LPIPS and FID.
format	Online Article Text
id	pubmed-9657195
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-96571952022-11-15 Exploration of Semantic Label Decomposition and Dataset Size in Semantic Indoor Scenes Synthesis via Optimized Residual Generative Adversarial Networks Ibrahem, Hatem Salem, Ahmed Kang, Hyun-Soo Sensors (Basel) Article In this paper, we revisit the paired image-to-image translation using the conditional generative adversarial network, the so-called “Pix2Pix”, and propose efficient optimization techniques for the architecture and the training method to maximize the architecture’s performance to boost the realism of the generated images. We propose a generative adversarial network-based technique to create new artificial indoor scenes using a user-defined semantic segmentation map as an input to define the location, shape, and category of each object in the scene, exactly similar to Pix2Pix. We train different residual connections-based architectures of the generator and discriminator on the NYU depth-v2 dataset and a selected indoor subset from the ADE20K dataset, showing that the proposed models have fewer parameters, less computational complexity, and can generate better quality images than the state of the art methods following the same technique to generate realistic indoor images. We also prove that using extra specific labels and more training samples increases the quality of the generated images; however, the proposed residual connections-based models can learn better from small datasets (i.e., NYU depth-v2) and can improve the realism of the generated images in training on bigger datasets (i.e., ADE20K indoor subset) in comparison to Pix2Pix. The proposed method achieves an LPIPS value of 0.505 and an FID value of 81.067, generating better quality images than that produced by Pix2Pix and other recent paired Image-to-image translation methods and outperforming them in terms of LPIPS and FID. MDPI 2022-10-29 /pmc/articles/PMC9657195/ /pubmed/36366007 http://dx.doi.org/10.3390/s22218306 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Ibrahem, Hatem Salem, Ahmed Kang, Hyun-Soo Exploration of Semantic Label Decomposition and Dataset Size in Semantic Indoor Scenes Synthesis via Optimized Residual Generative Adversarial Networks
title	Exploration of Semantic Label Decomposition and Dataset Size in Semantic Indoor Scenes Synthesis via Optimized Residual Generative Adversarial Networks
title_full	Exploration of Semantic Label Decomposition and Dataset Size in Semantic Indoor Scenes Synthesis via Optimized Residual Generative Adversarial Networks
title_fullStr	Exploration of Semantic Label Decomposition and Dataset Size in Semantic Indoor Scenes Synthesis via Optimized Residual Generative Adversarial Networks
title_full_unstemmed	Exploration of Semantic Label Decomposition and Dataset Size in Semantic Indoor Scenes Synthesis via Optimized Residual Generative Adversarial Networks
title_short	Exploration of Semantic Label Decomposition and Dataset Size in Semantic Indoor Scenes Synthesis via Optimized Residual Generative Adversarial Networks
title_sort	exploration of semantic label decomposition and dataset size in semantic indoor scenes synthesis via optimized residual generative adversarial networks
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9657195/ https://www.ncbi.nlm.nih.gov/pubmed/36366007 http://dx.doi.org/10.3390/s22218306
work_keys_str_mv	AT ibrahemhatem explorationofsemanticlabeldecompositionanddatasetsizeinsemanticindoorscenessynthesisviaoptimizedresidualgenerativeadversarialnetworks AT salemahmed explorationofsemanticlabeldecompositionanddatasetsizeinsemanticindoorscenessynthesisviaoptimizedresidualgenerativeadversarialnetworks AT kanghyunsoo explorationofsemanticlabeldecompositionanddatasetsizeinsemanticindoorscenessynthesisviaoptimizedresidualgenerativeadversarialnetworks

Exploration of Semantic Label Decomposition and Dataset Size in Semantic Indoor Scenes Synthesis via Optimized Residual Generative Adversarial Networks

Ejemplares similares