Efficient attention-based deep encoder and decoder for automatic crack segmentation
Recently, crack segmentation studies have been investigated using deep convolutional neural networks. However, significant deficiencies remain in the preparation of ground truth data, consideration of complex scenes, development of an object-specific network for crack segmentation, and use of an evaluation method, among other issues.
Main Authors: Kang, Dong H; Cha, Young-Jin
Format: Online Article Text
Language: English
Published: SAGE Publications, 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9411784/ https://www.ncbi.nlm.nih.gov/pubmed/36039173 http://dx.doi.org/10.1177/14759217211053776
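The abstract above lists a focal-Tversky loss function and a learnable swish activation among the components of STRNet. The following is a minimal, generic PyTorch sketch of those two ideas, not the authors' implementation; the class names and the hyperparameter values (alpha, beta, gamma, init_beta) are placeholder assumptions for illustration.

```python
# Illustrative sketch of a focal Tversky loss and a Swish activation with a
# learnable slope, as named in the abstract. Hyperparameter values are
# placeholders, not the values used in STRNet.
import torch
import torch.nn as nn


class FocalTverskyLoss(nn.Module):
    """Focal Tversky loss for binary (crack / background) segmentation."""

    def __init__(self, alpha=0.7, beta=0.3, gamma=0.75, eps=1e-6):
        super().__init__()
        self.alpha, self.beta, self.gamma, self.eps = alpha, beta, gamma, eps

    def forward(self, logits, target):
        # logits, target: (N, 1, H, W); target is a binary ground-truth mask.
        prob = torch.sigmoid(logits)
        tp = (prob * target).sum(dim=(1, 2, 3))
        fn = ((1.0 - prob) * target).sum(dim=(1, 2, 3))
        fp = (prob * (1.0 - target)).sum(dim=(1, 2, 3))
        tversky = (tp + self.eps) / (tp + self.alpha * fn + self.beta * fp + self.eps)
        # The focal exponent gamma emphasizes hard examples (low Tversky index).
        return torch.pow(1.0 - tversky, self.gamma).mean()


class LearnableSwish(nn.Module):
    """Swish activation f(x) = x * sigmoid(beta * x) with a trainable beta."""

    def __init__(self, init_beta=1.0):
        super().__init__()
        self.beta = nn.Parameter(torch.tensor(init_beta))

    def forward(self, x):
        return x * torch.sigmoid(self.beta * x)
```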
_version_ | 1784775341173112832 |
author | Kang, Dong H Cha, Young-Jin |
author_facet | Kang, Dong H Cha, Young-Jin |
author_sort | Kang, Dong H |
collection | PubMed |
description | Recently, crack segmentation studies have been investigated using deep convolutional neural networks. However, significant deficiencies remain in the preparation of ground truth data, consideration of complex scenes, development of an object-specific network for crack segmentation, and use of an evaluation method, among other issues. In this paper, a novel semantic transformer representation network (STRNet) is developed for crack segmentation at the pixel level in complex scenes in a real-time manner. STRNet is composed of a squeeze-and-excitation attention-based encoder, a multi-head attention-based decoder, coarse upsampling, a focal-Tversky loss function, and a learnable swish activation function to design the network concisely by keeping its fast-processing speed. A method for evaluating the level of complexity of image scenes was also proposed. The proposed network is trained with 1203 images with further extensive synthesis-based augmentation, and it is investigated with 545 testing images (1280 × 720, 1024 × 512); it achieves 91.7%, 92.7%, 92.2%, and 92.6% in terms of precision, recall, F1 score, and mIoU (mean intersection over union), respectively. Its performance is compared with those of recently developed advanced networks (Attention U-net, CrackSegNet, Deeplab V3+, FPHBN, and Unet++), with STRNet showing the best performance in the evaluation metrics; it achieves the fastest processing at 49.2 frames per second. |
format | Online Article Text |
id | pubmed-9411784 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | SAGE Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-9411784 2022-08-27 Efficient attention-based deep encoder and decoder for automatic crack segmentation Kang, Dong H Cha, Young-Jin Struct Health Monit Original Articles Recently, crack segmentation studies have been investigated using deep convolutional neural networks. However, significant deficiencies remain in the preparation of ground truth data, consideration of complex scenes, development of an object-specific network for crack segmentation, and use of an evaluation method, among other issues. In this paper, a novel semantic transformer representation network (STRNet) is developed for crack segmentation at the pixel level in complex scenes in a real-time manner. STRNet is composed of a squeeze-and-excitation attention-based encoder, a multi-head attention-based decoder, coarse upsampling, a focal-Tversky loss function, and a learnable swish activation function to design the network concisely by keeping its fast-processing speed. A method for evaluating the level of complexity of image scenes was also proposed. The proposed network is trained with 1203 images with further extensive synthesis-based augmentation, and it is investigated with 545 testing images (1280 × 720, 1024 × 512); it achieves 91.7%, 92.7%, 92.2%, and 92.6% in terms of precision, recall, F1 score, and mIoU (mean intersection over union), respectively. Its performance is compared with those of recently developed advanced networks (Attention U-net, CrackSegNet, Deeplab V3+, FPHBN, and Unet++), with STRNet showing the best performance in the evaluation metrics; it achieves the fastest processing at 49.2 frames per second. SAGE Publications 2021-12-19 2022-09 /pmc/articles/PMC9411784/ /pubmed/36039173 http://dx.doi.org/10.1177/14759217211053776 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage). |
spellingShingle | Original Articles Kang, Dong H Cha, Young-Jin Efficient attention-based deep encoder and decoder for automatic crack segmentation |
title | Efficient attention-based deep encoder and decoder for automatic crack segmentation |
title_full | Efficient attention-based deep encoder and decoder for automatic crack segmentation |
title_fullStr | Efficient attention-based deep encoder and decoder for automatic crack segmentation |
title_full_unstemmed | Efficient attention-based deep encoder and decoder for automatic crack segmentation |
title_short | Efficient attention-based deep encoder and decoder for automatic crack segmentation |
title_sort | efficient attention-based deep encoder and decoder for automatic crack segmentation |
topic | Original Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9411784/ https://www.ncbi.nlm.nih.gov/pubmed/36039173 http://dx.doi.org/10.1177/14759217211053776 |
work_keys_str_mv | AT kangdongh efficientattentionbaseddeepencoderanddecoderforautomaticcracksegmentation AT chayoungjin efficientattentionbaseddeepencoderanddecoderforautomaticcracksegmentation |
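The record reports precision, recall, F1 score, and mIoU on 545 test images. The sketch below shows one common way to compute such pixel-wise metrics from binary masks with NumPy; the mIoU averaging convention used here (over the crack and background classes) is an assumption and may differ from the paper's exact evaluation protocol.

```python
# Hedged NumPy sketch of pixel-wise precision, recall, F1, and mean IoU for
# binary crack segmentation. The mIoU averaging convention is an assumption.
import numpy as np


def segmentation_metrics(pred, gt):
    """pred, gt: arrays of the same shape, nonzero/True = crack pixel."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)

    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()

    precision = tp / (tp + fp + 1e-9)
    recall = tp / (tp + fn + 1e-9)
    f1 = 2 * precision * recall / (precision + recall + 1e-9)

    iou_crack = tp / (tp + fp + fn + 1e-9)
    iou_background = tn / (tn + fp + fn + 1e-9)
    miou = (iou_crack + iou_background) / 2.0

    return {"precision": precision, "recall": recall, "f1": f1, "miou": miou}
```

Per-image scores computed this way would then typically be averaged over the test set; whether the paper averages per image or over all pixels is not stated in this record.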