Structural inference embedded adversarial networks for scene parsing

Bibliographic Details
Main Authors: Wang, ZeYu, Wu, YanXia, Bu, ShuHui, Han, PengCheng, Zhang, GuoYin
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5896926/
https://www.ncbi.nlm.nih.gov/pubmed/29649294
http://dx.doi.org/10.1371/journal.pone.0195114
_version_ 1783313887748161536
author Wang, ZeYu
Wu, YanXia
Bu, ShuHui
Han, PengCheng
Zhang, GuoYin
author_facet Wang, ZeYu
Wu, YanXia
Bu, ShuHui
Han, PengCheng
Zhang, GuoYin
author_sort Wang, ZeYu
collection PubMed
description Explicit structural inference is a key factor in improving the accuracy of scene parsing. Meanwhile, adversarial training is able to reinforce spatial contiguity in output segmentations. To take advantage of both structural learning and adversarial training simultaneously, we propose a novel deep learning network architecture called Structural Inference Embedded Adversarial Networks (SIEANs) for pixel-wise scene labeling. The generator of our SIEANs, a newly designed scene parsing network, makes full use of convolutional neural networks and long short-term memory networks to learn the global contextual information of objects in four different directions from RGB-(D) images, which describes the (three-dimensional) spatial distributions of objects in a more comprehensive and accurate way. To further improve performance, we explore adversarial training to optimize the generator along with a discriminator, which can not only detect and correct higher-order inconsistencies between the predicted segmentations and the corresponding ground truths, but also exploit the full potential of the generator by fine-tuning its parameters to obtain higher consistencies. The experimental results demonstrate that our proposed SIEANs achieve better performance on the PASCAL VOC 2012, SIFT FLOW, PASCAL Person-Part, Cityscapes, Stanford Background, NYUDv2, and SUN-RGBD datasets than most state-of-the-art methods.
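
Editorial note: the abstract above describes a segmentation generator trained jointly with a discriminator that penalizes higher-order inconsistencies with the ground truth. The following is a minimal, hypothetical PyTorch-style sketch of that general adversarial training scheme, not the authors' implementation; the generator and discriminator modules, the discriminator's (image, label-map) input signature, and the weight lam are illustrative assumptions.

# Sketch of one adversarial training step for semantic segmentation (assumed setup).
import torch
import torch.nn.functional as F

def adversarial_seg_step(generator, discriminator, g_opt, d_opt,
                         images, labels, num_classes, lam=0.1):
    """images: (N, 3, H, W); labels: (N, H, W) long tensor of class ids.
    The discriminator is assumed to take (image, label map) and return a raw logit."""
    # Discriminator update: real = one-hot ground truth, fake = generator prediction.
    with torch.no_grad():
        logits = generator(images)                               # (N, C, H, W)
    pred = F.softmax(logits, dim=1)
    gt_onehot = F.one_hot(labels, num_classes).permute(0, 3, 1, 2).float()
    d_real = discriminator(images, gt_onehot)
    d_fake = discriminator(images, pred)
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) +
              F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator update: per-pixel cross-entropy plus an adversarial term that
    # rewards predictions the discriminator accepts as ground truth.
    logits = generator(images)
    pred = F.softmax(logits, dim=1)
    seg_loss = F.cross_entropy(logits, labels)
    adv = discriminator(images, pred)
    g_loss = seg_loss + lam * F.binary_cross_entropy_with_logits(adv, torch.ones_like(adv))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return g_loss.item(), d_loss.item()

In this sketch the adversarial term is one common way to realize the "detect and correct higher-order inconsistencies" idea: the discriminator judges whole label maps rather than individual pixels, so the generator is pushed toward spatially coherent segmentations.
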
format Online
Article
Text
id pubmed-5896926
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5896926 2018-05-04 Structural inference embedded adversarial networks for scene parsing Wang, ZeYu Wu, YanXia Bu, ShuHui Han, PengCheng Zhang, GuoYin PLoS One Research Article Explicit structural inference is a key factor in improving the accuracy of scene parsing. Meanwhile, adversarial training is able to reinforce spatial contiguity in output segmentations. To take advantage of both structural learning and adversarial training simultaneously, we propose a novel deep learning network architecture called Structural Inference Embedded Adversarial Networks (SIEANs) for pixel-wise scene labeling. The generator of our SIEANs, a newly designed scene parsing network, makes full use of convolutional neural networks and long short-term memory networks to learn the global contextual information of objects in four different directions from RGB-(D) images, which describes the (three-dimensional) spatial distributions of objects in a more comprehensive and accurate way. To further improve performance, we explore adversarial training to optimize the generator along with a discriminator, which can not only detect and correct higher-order inconsistencies between the predicted segmentations and the corresponding ground truths, but also exploit the full potential of the generator by fine-tuning its parameters to obtain higher consistencies. The experimental results demonstrate that our proposed SIEANs achieve better performance on the PASCAL VOC 2012, SIFT FLOW, PASCAL Person-Part, Cityscapes, Stanford Background, NYUDv2, and SUN-RGBD datasets than most state-of-the-art methods. Public Library of Science 2018-04-12 /pmc/articles/PMC5896926/ /pubmed/29649294 http://dx.doi.org/10.1371/journal.pone.0195114 Text en © 2018 Wang et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Wang, ZeYu
Wu, YanXia
Bu, ShuHui
Han, PengCheng
Zhang, GuoYin
Structural inference embedded adversarial networks for scene parsing
title Structural inference embedded adversarial networks for scene parsing
title_full Structural inference embedded adversarial networks for scene parsing
title_fullStr Structural inference embedded adversarial networks for scene parsing
title_full_unstemmed Structural inference embedded adversarial networks for scene parsing
title_short Structural inference embedded adversarial networks for scene parsing
title_sort structural inference embedded adversarial networks for scene parsing
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5896926/
https://www.ncbi.nlm.nih.gov/pubmed/29649294
http://dx.doi.org/10.1371/journal.pone.0195114
work_keys_str_mv AT wangzeyu structuralinferenceembeddedadversarialnetworksforsceneparsing
AT wuyanxia structuralinferenceembeddedadversarialnetworksforsceneparsing
AT bushuhui structuralinferenceembeddedadversarialnetworksforsceneparsing
AT hanpengcheng structuralinferenceembeddedadversarialnetworksforsceneparsing
AT zhangguoyin structuralinferenceembeddedadversarialnetworksforsceneparsing